The Pragmatic Programmer: Chapter 7

PhaseSmith · April 2, 2022

Conventional wisdom says that once a project is in the coding phase, the work is mostly mechanical, transcribing the design into executable statements. We think that this attitude is the single biggest reason that software projects fail, and many systems end up ugly, inefficient, poorly structured, unmaintainable, or just plain wrong. (p. 324)

Topic 37. Listen to your lizard brain

Instincts are signals that need to be dealt with.

The following are situations where your instincts might signal something:

Fear of the blank page:

Two causes:

  1. the lizard brain is trying to tell you that there’s some kind of doubt lurking just below the surface of perception

    When you feel a nagging doubt, or experience some reluctance when faced with a task, it might be that experience trying to speak to you. Heed it. (p.328)

  2. afraid to make mistakes

Fighting yourself:

When coding feels like walking uphill in the mud, it’s telling you that “this is harder than it should be.”

Whatever the reason, your lizard brain is sensing feedback from the code, and it’s desperately trying to get you to listen. (p. 329)

Tip 61. Listen to your inner lizard

If you get an instinctive signal, stop and think about it.

  • leave your code and take a walk
  • externalize the issue: explain your code to a coworker
  • prototyping: explore other aspects of the code

    Do the following.

    1. Write “I’m prototyping” on a sticky note, and stick it on the side of your screen.
    2. Remind yourself that prototypes are meant to fail. And remind yourself that prototypes get thrown away, even if they don’t fail. There is no downside to doing this.
    3. In your empty editor buffer, create a comment describing in one sentence what you want to learn or do.
    4. Start coding. (p. 331)
  • read other people’s code, taking notes and looking for their thinking patterns

Topic 38. Programming by Coincidence

Program deliberately: there should always be a reason for what you are coding.

If you program by coincidence, once you fail, you won’t know why, because you never knew how it worked in the first place.

Suppose you call a routine with bad data. The routine responds in a particular way, and you code based on that response. But the author didn’t intend for the routine to work that way—it was never even considered. When the routine gets “fixed,” your code may break. In the most extreme case, the routine you called may not even be designed to do what you want, but it seems to work okay. Calling things in the wrong order, or in the wrong context, is a related problem. (p. 335)
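A minimal sketch of how this happens (the routine and calls are hypothetical): the caller leans on behavior the author never promised.

```python
def parse_port(value):
    """Documented contract: value is a decimal string such as "8080"."""
    return int(value)

# Coincidence: int() also happens to tolerate surrounding whitespace,
# so this call "works" today even though it violates the contract...
port = parse_port("  8080  ")

# ...but if parse_port is ever "fixed" to validate its input strictly,
# the call above breaks. The deliberate version meets the documented
# contract itself:
port = parse_port("  8080  ".strip())
```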

Why you should think twice when code just happens to work:

  • It may not really be working—it might just look like it is.
  • The boundary condition you rely on may be just an accident. In different circumstances (a different screen resolution, more CPU cores), it might behave differently.
  • Undocumented behavior may change with the next release of the library.
  • Additional and unnecessary calls make your code slower.
  • Additional calls increase the risk of introducing new bugs of their own.

For code you write that others will call, the basic principles of good modularization and of hiding implementation behind small, well-documented interfaces can all help. A well-specified contract (see Topic 23, Design by Contract) can help eliminate misunderstandings.
For routines you call, rely only on documented behavior. If you can’t, for whatever reason, then document your assumption well. (p. 336-337)
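One way to document an assumption is to check it in the code itself. A small sketch (the index_of helper is hypothetical) that makes bisect’s documented precondition, sorted input, explicit:

```python
from bisect import bisect_left

def index_of(sorted_items, target):
    """Return the index of target in sorted_items.

    Documented assumption: sorted_items is in ascending order,
    which is the precondition bisect itself documents.
    """
    assert all(a <= b for a, b in zip(sorted_items, sorted_items[1:])), \
        "index_of requires ascending input"
    i = bisect_left(sorted_items, target)
    if i < len(sorted_items) and sorted_items[i] == target:
        return i
    raise ValueError(f"{target!r} not found")
```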

Don’t assume it, prove it. (p. 338)

Tip 62. Don’t program by coincidence

Assumptions that aren’t based on well-established facts are the bane of all projects.

Programming deliberately:

  • Always be aware of what you’re doing
  • You must be able to explain your code clearly
  • Don’t code in the dark: if you’re not sure why it works, you won’t know why it fails.
  • Proceed from a plan
  • Rely only on reliable things. If you can’t tell whether something is reliable, assume the worst
  • Document your assumptions
  • Don’t guess; try it.
  • Prioritize your effort: spend time on the important parts
  • Don’t let existing code dictate future code: all code can be replaced if it’s no longer appropriate (refactoring)

Topic 39. Algorithmic Speed

Estimating the order of many algorithms:

  • Simple loops: O(n)
    ex) exhaustive searches, finding min/max values, generating checksums
  • Nested loops: O(m × n), where m and n are the limits of the two loops
    ex) simple sorting loops
  • Binary Chop: O(log n)
    ex) binary search, traversing a binary tree, finding the first set bit in a machine word
  • Divide and Conquer: O(n log n) → algorithms that partition their input, work on the two halves independently, and then combine the results
    ex) quicksort: partitions the data into two halves and recursively sorts each
  • Combinatoric: O(n!)
    ex) traveling salesman problem, optimally packing things in a container, partitioning a set of numbers so that each set has the same total → often, heuristics are used to reduce the running times of these algorithms in particular problem domains

If that loop contains an inner loop, then you're looking at O(m × n). You should be asking yourself how large these values can get. If the numbers are bounded, then you'll know how long the code will take to run. If the numbers depend on external factors (such as the number of records in an overnight batch run, or the number of names in a list of people), then you might want to stop and consider the effect that large values may have on your running time or memory consumption. (p. 350)

Tip 63. Estimate the order of your algorithms

If you have an algorithm that is O(n²), try to find a divide-and-conquer approach that will take you down to O(n log n).

If you're not sure how long your code will take, or how much memory it will use, try running it, varying the input record count or whatever is likely to impact the runtime. Then plot the results.
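A minimal timing harness along those lines (measure and pair_scan are made-up names): if the time roughly quadruples each time n doubles, the O(n²) estimate holds. Print the results, or feed them to a plotting library.

```python
import time

def measure(fn, sizes):
    """Time fn on inputs of increasing size and print the results."""
    for n in sizes:
        data = list(range(n))
        start = time.perf_counter()
        fn(data)
        print(f"n={n:>6}  t={time.perf_counter() - start:.4f}s")

def pair_scan(data):
    """Suspected O(n^2): compares every pair of elements."""
    return sum(1 for a in data for b in data if a < b)

measure(pair_scan, [500, 1000, 2000, 4000])
```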

Tip 64. Test your estimates

You also need to be pragmatic about choosing appropriate algorithms—the fastest one is not always the best for the job. Given a small input set, a straightforward insertion sort will perform just as well as a quicksort, and will take you less time to write and debug. You also need to be careful if the algorithm you choose has a high setup cost. For small input sets, this setup may dwarf the running time and make the algorithm inappropriate. (p. 351)

Premature optimization: make sure an algorithm is really a bottleneck before investing precious time trying to improve it.

Every developer should have a feel for how algorithms are designed and analyzed. Robert Sedgewick has written a series of accessible books on the subject (Algorithms or An Introduction to the Analysis of Algorithms etc.). We recommend adding one of his books to your collection, and making a point of reading it.
For those who like more detail than Sedgewick provides, read Donald Knuth’s definitive Art of Computer Programming books, which analyze a wide range of algorithms.
The Art of Computer Programming, Volume 1: Fundamental Algorithms
The Art of Computer Programming, Volume 2: Seminumerical Algorithms
The Art of Computer Programming, Volume 3: Sorting and Searching
The Art of Computer Programming, Volume 4A: Combinatorial Algorithms, Part 1 (p. 351-352)

Topic 40. Refactoring

Well, software doesn’t quite work that way. Rather than construction, software is more like gardening—it is more organic than concrete. You plant many things in a garden according to an initial plan and conditions. Some thrive, others are destined to end up as compost. You may move plantings relative to each other to take advantage of the interplay of light and shadow, wind and rain. Overgrown plants get split or pruned, and colors that clash may get moved to more aesthetically pleasing locations. You pull weeds, and you fertilize plantings that are in need of some extra help. You constantly monitor the health of the garden, and make adjustments (to the soil, the plants, the layout) as needed. (p. 354-355)

Definition of Refactoring:

Disciplined technique for restructuring an existing body of code, altering its internal structure without changing its external behavior.

[R]efactoring is a day-to-day activity, taking low-risk small steps, more like weeding and raking. (p. 356)

When to refactor:

  • Duplication: violation of DRY principle
  • Nonorthogonal design
  • Outdated knowledge: Things change, requirements drift, and your knowledge of the problem increases. Code needs to keep up.
  • Usage: as the system is used in the real world, you realize some features are more important than you thought, and others less so.
  • Performance: move functionality to improve performance
  • The tests pass: after testing, tidy up the code

Tip 65. Refactor Early, Refactor Often

How to refactor:

  1. Don’t try to refactor and add functionality at the same time.
  2. Make sure you have good tests before you begin refactoring. Run the tests as often as possible. That way you will know quickly if your changes have broken anything.
  3. Take short, deliberate steps: move a field from one class to another, split a method, rename a variable. Refactoring often involves making many localized changes that result in a larger-scale change. If you keep your steps small, and test after each step, you will avoid prolonged debugging.
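A sketch of one such small step (the report example is hypothetical): split a method, keep the external behavior identical, and re-run the tests before taking the next step.

```python
# Before: one function computes and formats.
def report(orders):
    total = sum(o["price"] * o["qty"] for o in orders)
    return f"Total: {total:.2f}"

# After one small step: the calculation is split out; behavior is unchanged.
def order_total(orders):
    return sum(o["price"] * o["qty"] for o in orders)

def report(orders):  # redefined here to show the "after" state
    return f"Total: {order_total(orders):.2f}"

def test_report_unchanged():
    assert report([{"price": 2.0, "qty": 3}]) == "Total: 6.00"
```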

Topic 41. Test to Code

Tip 66. Testing is not about finding bugs

We believe that the major benefits of testing happen when you think about and write the tests, not when you run them. (p. 361)

Writing a test makes us look at our code as if we were a client of the code.

Tip 67. A test is the first user of your code

Testing is vital feedback that guides your coding.

Making the code testable forces you to reduce coupling within the code.

Creating tests for your code will force you to understand it.

Test-Driven Development (TDD):

  1. Decide on a small piece of functionality you want to add.
  2. Write a test that will pass once that functionality is implemented.
  3. Run all tests. Verify that the only failure is the one you just wrote.
  4. Write the smallest amount of code needed to get the test to pass, and verify that the tests now run cleanly.
  5. Refactor your code: see if there is a way to improve on what you just wrote (the test or the function). Make sure the tests still pass when you’re done.

→ this cycle should be very short: a matter of minutes, so that you’re constantly writing tests and then getting them to work.
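One turn of the cycle might look like this in pytest (fizzbuzz is just a stand-in example): the test is written first and fails, then the smallest passing implementation follows.

```python
# Steps 1-3. Red: decide on the behavior and write a failing test for it.
def test_fizzbuzz_of_15():
    assert fizzbuzz(15) == "FizzBuzz"

# Step 4. Green: the smallest code that makes the test pass.
def fizzbuzz(n):
    if n % 15 == 0:
        return "FizzBuzz"
    if n % 3 == 0:
        return "Fizz"
    if n % 5 == 0:
        return "Buzz"
    return str(n)

# Step 5. Refactor: tidy the test and the code while everything stays green.
```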

Be careful not to fall into the traps of overdoing TDD:

  • spending too much time ensuring that the test has 100% coverage
  • writing redundant tests
  • letting your design start at the bottom and work its way up, losing sight of the big picture

Tip 68. Build End-to-End, Not Top-Down or Bottom Up

Top-down: Start with the overall problem you’re trying to solve and break it into a small number of pieces. Then break each of these into smaller pieces, and so on, until you end up with pieces small enough to express in code

→ Drawback: it’s impossible to express the whole requirement up front

Bottom-up: Produce a layer of code that provides abstractions closer to the problem you’re trying to solve. Then add another layer with higher-level abstractions, and keep adding layers until the final layer is an abstraction that solves the problem.

→ Drawback: it’s difficult to decide on functionality without knowing the direction of the development as a whole.

End-to-End:

We strongly believe that the only way to build software is incrementally. Build small pieces of end-to-end functionality, learning about the problem as you go. Apply this learning as you continue to flesh out the code, involve the customer at each step, and have them guide the process. (p. 366)

TDD is important, but you always need to be aware of the big picture.

Unit Testing:

Testing done on each module, in isolation, to verify its behavior.

The unit test establishes some artificial environment, then invokes routines in the module being tested. It then checks the results that are returned, either against known values or against the results from previous runs of the same test (regression testing).

We can use the same unit test facilities to test the system as a whole.

Testing against contract:

Unit testing is like testing against contract → write test cases that ensure that a given unit honors its contract.
→ this will reveal one of two things:

  1. whether the code meets the contract
  2. whether the contract means what we think it means

We can test the preconditions and postconditions of the contract, as well as boundary cases.
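A sketch of such a contract test in pytest (safe_sqrt is a hypothetical unit under test): the precondition, the postcondition, and the zero boundary each get a check.

```python
import math
import pytest

def safe_sqrt(x):
    """Contract: requires x >= 0; guarantees result >= 0 and
    result**2 == x (within floating-point error)."""
    if x < 0:
        raise ValueError("x must be non-negative")
    return math.sqrt(x)

@pytest.mark.parametrize("x", [0.0, 1.0, 2.0, 1e10])  # 0.0 is the boundary case
def test_postcondition(x):
    result = safe_sqrt(x)
    assert result >= 0 and math.isclose(result * result, x)

def test_precondition_is_enforced():
    with pytest.raises(ValueError):
        safe_sqrt(-1.0)
```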

To test modules that depend on other submodules, test the subcomponents first. If a bug appears, we can narrow down its cause by checking that the submodules work as expected.

Tip 69. Design to Test

Ad hoc testing: an informal, unstructured type of software testing that aims to break the system in order to find possible defects or errors as early as possible. Ad hoc testing is done randomly; it is usually an unplanned activity that doesn’t follow any documentation or test design techniques to create test cases.

Build a test window:

We can provide various views into the internal state of a module, without using the debugger.

Log files containing trace messages are one such mechanism. Log messages should be in a regular, consistent format.

“Hot key” sequence or magic URL: When this particular combination of keys is pressed, or the URL is accessed, a diagnostic control window pops up with status messages and so on

Use feature switches to enable extra diagnostics for a particular user or class of users.
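A minimal sketch of both mechanisms (the logger name and the DIAG_USERS switch are hypothetical): a consistent trace format, plus an environment-driven feature switch for extra diagnostics.

```python
import logging
import os

# Consistent, parseable trace format for every log line.
logging.basicConfig(
    format="%(asctime)s %(levelname)s %(name)s %(message)s",
    level=logging.INFO,
)
log = logging.getLogger("orders")

# Feature switch: enable extra diagnostics only for listed users.
DIAG_USERS = set(filter(None, os.environ.get("DIAG_USERS", "").split(",")))

def process(user, order_id):
    if user in DIAG_USERS:
        log.info("diag user=%s order=%s state=received", user, order_id)
    # ... normal processing ...
```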

Tip 70. Test your software, or your users will

Topic 42. Property-Based Testing

Contracts, Invariants, and Properties:

Contracts: certain guarantees about the output

Invariants: things that remain true about some piece of state when it’s passed through a function.

Properties: contracts and invariants combined → used to automate testing.

Tip 71. Use Property-based Tests to validate your assumptions

In Python, you can use the hypothesis library together with pytest for property-based testing.
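A minimal hypothesis sketch (my_sort is a hypothetical unit under test): the strategy generates the inputs, and the assertions state the properties that must hold for all of them.

```python
from collections import Counter
from hypothesis import given, strategies as st

def my_sort(xs):
    return sorted(xs)

@given(st.lists(st.integers()))
def test_sort_properties(xs):
    result = my_sort(xs)
    # Contract: output is in ascending order.
    assert all(a <= b for a, b in zip(result, result[1:]))
    # Invariant: no elements are gained or lost.
    assert Counter(result) == Counter(xs)
```

Run it with pytest; hypothesis calls the test with many generated lists, including empty and single-element ones.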

The power of property-based tests: setting up rules for generating inputs and assertions for validating output will help reveal wrong assumptions.

→ problem: tricky to pin down what failed

→ solution: find out what parameters the framework was passing to the test function, and then use those values to create a separate, regular unit test (as sketched below). This has two benefits:

  1. it lets you focus on the problem without all the additional calls being made into your code by the property-based framework
  2. it acts as a regression test, forcing the values from the randomly generated property-based inputs to be used again
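For example, if hypothesis reported a falsifying example such as xs=[0, -1] (a made-up case), pin those exact values in a plain unit test alongside the property test:

```python
def test_sort_regression_from_hypothesis():
    # Values copied from a failing property-based run;
    # reuses my_sort from the sketch above.
    assert my_sort([0, -1]) == [-1, 0]
```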

The same is true of property-based tests, but in a slightly different way. They make you think about your code in terms of invariants and contracts; you think about what must not change, and what must be true. This extra insight has a magical effect on your code, removing edge cases and highlighting functions that leave data in an inconsistent state.
We believe that property-based testing is complementary to unit testing: they address different concerns, and each brings its own benefits. If you’re not currently using them, give them a go. (p. 384)

Topic 43. Stay safe out there

The next thing you have to do is analyze the code for ways it can go wrong and add those to your test suite. You’ll consider things such as passing in bad parameters, leaking or unavailable resources; that sort of thing. (p. 386)

Security basic principles:

  1. Minimize attack surface area
  2. Principle of least privilege
  3. Secure Defaults
  4. Encrypt Sensitive Data
  5. Maintain Security Updates

Tip 72. Keep it simple and minimize attack surfaces

Minimize attack surface area:

Attack surface area: the sum of all access points where an attacker can enter data, extract data, or invoke execution of service.

  • Code complexity leads to attack vectors: less code means fewer bugs and fewer opportunities for a crippling security hole.
  • Input data is an attack vector: never trust data from an external entity; always sanitize it before passing it on to a database, view rendering, or other processing (see the sketch after this list).
  • Unauthenticated services are an attack vector: by their very nature, any user anywhere in the world can call unauthenticated services, so barring any other handling or limiting, you’ve immediately created an opportunity for at least a denial-of-service attack.
  • Authenticated services are an attack vector: keep the number of authorized users at an absolute minimum. Cull unused, old, or outdated users and services.
  • Output data is an attack vector: don’t give away information. Make sure the data you report is appropriate for that user’s authorization. Truncate or obfuscate potentially risky information such as Social Security or other government ID numbers.
  • Debugging info is an attack vector: make sure any “test window” and runtime exception reporting is protected.
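For the input-data point, a minimal sketch using sqlite3 from the standard library: let the driver bind external input as a parameter instead of splicing it into the SQL string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")

def find_user(name):
    # Parameter binding sanitizes the external input; an f-string like
    # f"... WHERE name = '{name}'" would be the attack vector.
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()
```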

Principle of least privilege:

Don’t immediately grab the highest permission level, such as root or admin. If that high level is needed, take it, do the minimum amount of work, and relinquish the permission quickly to reduce the risk.

Secure Defaults:

The default setting on the application or service should be the most secure values.

Encrypt sensitive data:

Don’t leave sensitive data in plain text. Don’t check in secrets, API keys, SSH keys, encryption passwords or other credentials alongside your source code in version control.

Keys and secrets need to be managed separately, generally via config files or environment variables as part of build and deployment.
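A minimal sketch (the variable name API_KEY is an assumption): the secret comes from the deployment environment, never from the repository.

```python
import os

# Set by the deployment pipeline or a local env loader, not checked in.
API_KEY = os.environ["API_KEY"]  # a KeyError here means: fail fast if missing
```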

You want to encourage long, random passwords with a high degree of entropy. Putting artificial constraints on passwords limits entropy and encourages bad password habits, leaving your users’ accounts vulnerable to takeover. (p. 392)

Tip 73. Apply security patches quickly

Maintain security updates:

The largest data breaches in history were caused by systems that were behind on their updates. Always apply security updates promptly.

Never roll your own cryptography; it will almost certainly fail.

As we’ve said elsewhere, rely only on reliable things: well-vetted, thoroughly examined, well-maintained, frequently updated, preferably open source libraries and frameworks. (p. 393)

Topic 44. Naming things

Naming is important because it reveals your intent and beliefs.

Things should be named according to the role they play in the code.

Pause and think “what is my motivation to create this?”

This is a powerful question, because it takes you out of the immediate problem-solving mindset and makes you look at the bigger picture. When you consider the role of a variable or function, you’re thinking about what is special about it, about what it can do, and what it interacts with. Often, we find ourselves realizing that what we were about to do made no sense, all because we couldn’t come up with an appropriate name. (p. 396)

When naming things, you’re constantly looking for ways of clarifying what you mean, and that act of clarification will lead you to a better understanding of your code as you write it. (p. 397)
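A tiny before/after sketch (hypothetical names): the second version names things by the role they play, which also forces you to decide what that role actually is.

```python
# Before: the names say nothing about intent.
def proc(d):
    return [x for x in d if x["amt"] > 0]

# After: each name answers "what role does this play in the code?"
def open_invoices(invoices):
    return [invoice for invoice in invoices if invoice["amt"] > 0]
```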

Follow the naming conventions of the language you are using.

Be consistent in naming → keep a project glossary listing the terms that have special meaning to the team.

Tip 74. Name well; rename when needed
