Asteroid Project (Part 2): Test Driven Development

Space Science with Python

Preface

This is the 22nd part of my Python tutorial series “Space Science with Python”. All code shown in the tutorial sessions is uploaded to GitHub. Enjoy!

Introduction

No, we are not yet starting with any Near-Earth Object (NEO) related Python development or implementation. We need to sharpen the axe before we can cut any tree. In our case: Let’s dive into some development concepts that will lead to a sustainable long-term project and Python library.

How do you code? Most developers (hobbyists and professionals alike) like to see quick results; they develop a prototype, a clickable interface, or a demo page to reach a first product milestone. In a professional working environment this may also be driven by external factors such as deadlines, product owners, business partners or customers who expect results within a certain time frame. People with different professional backgrounds and knowledge may also talk past each other.

The result? For example: spaghetti code that is not tested well enough and that is difficult to maintain, upgrade and understand!

In our science project we want to develop a reliable and sustainable Python solution: something that can be used by amateur or professional astronomers working on NEOs, and that developers and coders can extend with new functionality that is merged into the library.

But how can we ensure, right from the beginning, the sustainability and maintainability of our upcoming library? Well, besides a proper project plan and program structure (PEP 8 formatting, PEP 257 and NumPy docstrings), there is one concept from the world of agile working that appears tedious and somewhat boring: Test Driven Development (TDD).

Test Driven Development

TDD reverses the “classical programming approach”. The first-coding-then-testing paradigm is changed to first-testing-then-coding. The rules appear to be simple, yet confusing:

  1. Write a single (unit) test for a class / function / …

  2. Write production code that makes the failing unit test pass

  3. Do not continue with other functionalities until all unit tests succeed

These are harsh rules. First, I need to define a unit test for a function that does not exist yet. This test obviously fails. Then I write just enough code to make the unit test pass. Once the test passes, another unit test is added, possibly for the same function again, to prove its robustness or generic implementation. Does it pass again, and again, and again? Good! Your function appears to be suitable for later use. If not, rework it until all tests pass.

Once all unit tests for a single class or function pass, define and write new unit tests for the next element of your coding project.
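
The cycle described above can be sketched in a few lines. This is only a minimal illustration, not code from the upcoming library: the function name au_to_km and the two tests are hypothetical, and the tests are plain assert-based functions that a runner such as pytest could also collect. The tests come first in the file; the production code follows and is only as large as the tests demand.

```python
import math

# Step 1: the tests are written first. As long as au_to_km() does not exist,
# calling them raises a NameError, which is the expected initial failure.
def test_one_au_in_km():
    assert math.isclose(au_to_km(1.0), 149_597_870.7)


def test_zero_and_double_distance():
    # A second test for the same function checks that it is generic
    # and does not just return one hard-coded number.
    assert au_to_km(0.0) == 0.0
    assert math.isclose(au_to_km(2.0), 299_195_741.4)


# Step 2: production code that makes the failing tests pass.
# au_to_km is a hypothetical helper chosen for this sketch.
AU_IN_KM = 149_597_870.7  # astronomical unit in km (IAU 2012 definition)


def au_to_km(distance_au):
    """Convert a distance from astronomical units to kilometres."""
    return distance_au * AU_IN_KM


# Step 3: run all tests; only continue with the next function if they pass.
test_one_au_in_km()
test_zero_and_double_distance()
print("All unit tests passed")
```

Only after both tests succeed would the next function, with its own new tests, be started.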

Now this approach may sound boring. One wants to achieve great things in a short time! Why bother with this tedious approach? Where is the pioneering spirit? Well, even if you work on a truly innovative product or idea, you will need basic functionality, regardless of what you want to achieve. Testing first ensures that your work has a solid foundation to rely on. In the long run, code that starts out with several bugs takes more time to maintain, and if your scientific work relies on it … well … in the worst case you have to withdraw your analysis or numerical simulations.

Modes of TDD

In theory, TDD has three working / coding modes (which easily blend into one another). Let’s assume you created a unit test based on the product’s requirements and now need to develop the corresponding production code:

  • Obvious Implementation

The Obvious Implementation is the most trivial mode. Based on a unit test example you can clearly derive a generic function that solves the problem. Example: numerical values that are stored as strings need to be converted to actual floats. You immediately know that '23.15' can be converted to 23.15 by applying float('23.15').
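
As a minimal sketch of this mode, the test and the obvious implementation could look as follows. The function name to_float is an assumption made for this example; the only real API used is Python's built-in float().

```python
def test_to_float():
    # The example from the text: a numerical value stored as a string.
    assert to_float("23.15") == 23.15
    # float() also ignores surrounding whitespace, so this passes as well.
    assert to_float(" -0.5 ") == -0.5


def to_float(raw_value):
    """Convert a numerical value stored as a string to a float."""
    # The generic solution is obvious here: delegate to the built-in float().
    return float(raw_value)


test_to_float()
```

There is nothing to fake here: the generic solution is visible from the very first test.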

  • Fake it

A unit test needs to pass! You have no clue how to do it, so you simply fake the result: you set a few constants and return the hard-coded answer that the test assertion expects. Of course, this only fakes the result. However, starting from the faked constant you try to work backwards to the actual input. Eventually, after some thinking and research, you will find the Obvious Implementation that was not obvious at the beginning.
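
A small sketch may make the "work backwards" step more tangible. The mean-value problem and the names mean_value_faked and mean_value are hypothetical and chosen only because the path from the constant back to the input is short.

```python
TEST_INPUT = [1.0, 2.0, 3.0]
EXPECTED = 2.0


# Stage 1: fake it. The function ignores its input and returns exactly
# the constant that the assertion expects.
def mean_value_faked(values):
    return 2.0


assert mean_value_faked(TEST_INPUT) == EXPECTED  # passes, but proves little


# Stage 2: work backwards from the constant to the input:
#   2.0 == 6.0 / 3 == (1.0 + 2.0 + 3.0) / 3 == sum(values) / len(values)
def mean_value(values):
    """Return the arithmetic mean of a non-empty sequence of numbers."""
    return sum(values) / len(values)


assert mean_value(TEST_INPUT) == EXPECTED
assert mean_value([4.0]) == 4.0  # an input the faked version would get wrong
```

The faked version and the final version pass the same first test; only the second one survives additional inputs.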

  • Triangulation

The Fake it approach assumes that, based on faked constants and a single test, the developer will eventually determine a generic solution for the problem under test. However, if a problem is too complex, more tests are needed that cover more cases. More and more hard-coded solutions help one to triangulate the actual generic solution of a particular problem.

Assume your task is to download a file from a website. The unit test calls a function that is required to return a message like “Download succeeded”, but you do not know where to begin. At this point you can fake the outcome: create a function that will later contain the download functionality, create a string with the content “Download succeeded” and return it. Afterwards another unit test is added, e.g. one that checks the hash value of the downloaded static file. Again, you fake the result, this time by returning the hash value of a mock-up file. More and more tests are added that ask you to check the server’s response and so on. With more tests and more required functionality, you develop code that has to fulfil all requirements in the long run. Faked solutions gradually turn into generic solutions.
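
The hash-value step of this example can be sketched without any network access. Everything below is an assumption made for illustration: the mock-up file content b"abc", the helper names, and the choice of SHA-256 via the standard library's hashlib (the text does not say which hash is used). The point is only to show how a second test case exposes the faked solution and pushes the code towards the generic one.

```python
import hashlib

# Mock-up "downloaded" contents and their well-known SHA-256 digests.
MOCK_FILE = b"abc"
MOCK_FILE_SHA256 = (
    "ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad"
)
EMPTY_FILE_SHA256 = (
    "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"
)


def check_mock_file(hash_func):
    assert hash_func(MOCK_FILE) == MOCK_FILE_SHA256


def check_empty_file(hash_func):
    assert hash_func(b"") == EMPTY_FILE_SHA256


# Faked solution: returns the hard-coded hash of the mock-up file.
def file_hash_faked(content):
    return MOCK_FILE_SHA256


# Generic solution that the additional test case forces us to write.
def file_hash(content):
    """Return the SHA-256 hex digest of the given bytes."""
    return hashlib.sha256(content).hexdigest()


check_mock_file(file_hash_faked)  # the single test is satisfied by the fake
try:
    check_empty_file(file_hash_faked)
except AssertionError:
    print("The second test exposes the faked solution")

check_mock_file(file_hash)   # the generic solution passes both tests
check_empty_file(file_hash)
```

Further tests, for instance on the server response, would be added in the same way until no hard-coded answer can satisfy all of them.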

Refactoring & Documentation

Finally, if the tests pass, consider refactoring your code. In Python, you should follow the PEP 8 standard, which ensures a consistent, readable code style, provide comments that explain what your code is doing, and document it in PEP 257 style (and / or NumPy docstring style, or any other style you find suitable).
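
To give an impression of what this can look like, here is a small function written with PEP 8 naming and layout and a NumPy-style docstring. The function itself is hypothetical and only serves as a carrier for the formatting; other docstring styles would be equally valid.

```python
def mean_value(values):
    """Compute the arithmetic mean of a sequence of numbers.

    Parameters
    ----------
    values : sequence of float
        Input values. Must contain at least one element.

    Returns
    -------
    float
        Arithmetic mean of ``values``.

    Raises
    ------
    ValueError
        If ``values`` is empty.
    """
    # Fail early with a clear message instead of a ZeroDivisionError.
    if len(values) == 0:
        raise ValueError("values must not be empty")

    return sum(values) / len(values)


# The docstring is available at runtime, e.g. via help(mean_value).
print(mean_value.__doc__.splitlines()[0])
```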

Refactoring and documenting your code will also help you understand your own code. Maybe some parts contain logical errors that are not covered by unit tests? Maybe you missed an important feature? Coding, refactoring and documenting go hand in hand. A program should not only be an abstraction of a problem; it should also be human-readable to ensure long-term maintainability and sustainability.

Additionally, profiling your functions can then help you improve the performance of your product.
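
One possible way to do this is the cProfile module from the Python standard library. The sketch below profiles a hypothetical function, sum_of_squares, and prints the five most expensive entries sorted by cumulative time; which profiler is actually used in the project is not stated here, so treat this purely as an illustration.

```python
import cProfile
import io
import pstats


def sum_of_squares(n):
    """Return the sum of i**2 for i in range(n)."""
    return sum(i * i for i in range(n))


# Collect profiling data only for the call we are interested in.
profiler = cProfile.Profile()
profiler.enable()
result = sum_of_squares(100_000)
profiler.disable()

# Write the report into a string buffer so it can be printed or stored.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("cumulative").print_stats(5)
print(stream.getvalue())
```

Profiling after the tests pass has the advantage that any optimisation can immediately be checked against the existing unit tests.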

Conclusion & Outlook

Today we covered some basic principles of Test Driven Development, or TDD for short. TDD will help us develop a sustainable Python library for our Space Science NEO project that provides tested and accurate results. Before we dive deeper into the scientific part of this project, we will go through a TDD example next time. There we will walk step by step through the TDD process described here, since during the project itself you will only see the results and successful unit tests of some coding sessions (maybe an additional video blog would be useful for some interested readers?).

If you struggle with coding, if you are lost in your own development process, objects and functions, remember to read the Zen of Python:

The Zen of Python, by Tim Peters

Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren’t special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one — and preferably only one — obvious way to do it.
Although that way may not be obvious at first unless you’re Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it’s a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea — let’s do more of those!

Thomas

Translated from: https://towardsdatascience.com/asteroid-project-part-2-test-driven-development-ed7af6c1820e