MY YEAR-END READING STATISTICS #Wrapped
(originally for the Guardian)
Most people implicitly assume medical tests are infallible. If they test positive for X, they assume they have X. Or if they test negative for X, they’re confident they don’t have X. Neither is necessarily true.
Someone recently asked me why medical tests always have an error rate. It’s a good question.
A test is necessarily a proxy, a substitute for something else. You don’t run a test to determine whether someone has a gunshot wound: you just look.
A test for a disease is based on a pattern someone discovered. People who have a certain condition usually have a certain chemical in their blood, and people who do not have the condition usually do not have that chemical in their blood. But it’s not a direct observation of the disease.
“Usually” is as good as you can do in biology. It’s difficult to make universal statements. When I first started working with genetic data, I said something to a colleague about genetic sequences being made of A, C, G, and T. I was shocked when he replied “Usually. This is biology. Nothing always holds.” It turns out some viruses have Zs (aminoadenine) rather than As (adenine).
Error rates may be very low and still be misleading. A positive test for a rare disease is usually a false positive, even if the test is highly accurate. This is a canonical example of the need to use Bayes’ theorem. The details are written up here.
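To make that concrete, here is a back-of-the-envelope Bayes’ theorem calculation. The numbers are made up for illustration: a disease with 0.1% prevalence and a test that is 99% sensitive and 99% specific.

```python
# Illustrative numbers only, not figures for any real test.
prevalence = 0.001    # P(disease): 1 in 1,000 people
sensitivity = 0.99    # P(positive | disease)
specificity = 0.99    # P(negative | no disease)

# Bayes' theorem:
# P(disease | positive) = P(positive | disease) P(disease) / P(positive)
p_positive = sensitivity * prevalence + (1 - specificity) * (1 - prevalence)
p_disease_given_positive = sensitivity * prevalence / p_positive

print(f"P(disease | positive) = {p_disease_given_positive:.1%}")  # about 9%
```

Even with a 99% accurate test, roughly nine out of ten positives are false, because nearly everyone tested does not have the disease.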
The more often you test for something, the more often you’ll find it, correctly or incorrectly. The more conditions you test, the more conditions you find, correctly or incorrectly.
Wouldn’t it be great if your toilet had a built in lab that tested you for hundreds of diseases several times a day? No, it would not! The result would be frequent false positives, leading to unnecessary anxiety and expense.
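The arithmetic behind this is straightforward: if each test independently has even a small false positive rate, the probability of at least one false positive climbs quickly with the number of tests. A quick sketch, assuming an illustrative 1% false positive rate per test:

```python
fpr = 0.01  # assumed false positive rate per independent test
for n in (1, 10, 100, 365):
    p_any = 1 - (1 - fpr) ** n   # P(at least one false positive in n tests)
    print(f"{n:3d} tests: {p_any:.1%}")
# 1 test: 1.0%, 10 tests: 9.6%, 100 tests: 63.4%, 365 tests: 97.4%
```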
Up to this point we’ve discussed medical tests, but the same applies to any kind of test. Surveillance is a kind of test, one that also has false positives and false negatives. The more often Big Brother reads your email, the more likely it is that he will find something justifying a knock on your door.
Adrian previously discussed Working Effectively with Legacy Code when he talked about how to choose a programming language for your book. It deserves revisiting though, so here it is in the library section.
Quite simply, if you have not read this book yet, read it. If you have a colleague who has yet to read it, get them a copy. If someone asks you what one book to read about software engineering, it is this one. It is not Code Complete, Second Edition, nor is it Clean Code, nor any other book that claims to teach you how to get software right the first time around (you will not). It is not Programming Rust, nor Programming Elixir, nor any other book that claims to teach you a technology that solves all of your problems (it will not).
All of your code will either be abandoned or become legacy code, so if you do not plan on failing you should learn how to work effectively with legacy code.
The central premise of the book is that legacy code is usually unpleasant to work with because it is insufficiently tested, and that the way to work with it is to identify a few, hopefully not too invasive, changes that allow parts of the software to be tested in isolation. Once those parts are under test, it becomes easier to make further, perhaps more significant, changes that subdivide the tested parts into smaller testable units. After enough iterations of this you have a system whose behaviour is both described and stabilised by the tests, and which is easier to work with.
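As an illustration of the kind of small, minimally invasive change the book advocates, consider the sketch below. All the names are invented for the example: a calculation welded to a live dependency gains a seam when that dependency becomes a parameter, and can then be tested in isolation.

```python
# Before: the calculation can only run against real infrastructure.
#
#     def invoice_total(order_id):
#         order = ProductionDatabase().fetch_order(order_id)
#         return sum(line.price * line.qty for line in order.lines)

# After: one small change introduces a seam. Production code passes the
# real fetcher; a test can pass a stub instead.
def invoice_total(order_id, fetch_order):
    order = fetch_order(order_id)
    return sum(line.price * line.qty for line in order.lines)

# The calculation is now testable in isolation:
class Line:
    def __init__(self, price, qty):
        self.price, self.qty = price, qty

class StubOrder:
    lines = [Line(10.0, 2), Line(2.5, 4)]

assert invoice_total(42, lambda _order_id: StubOrder()) == 30.0
```

Each such seam makes the next one cheaper to introduce, which is what drives the iteration described above.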
Why bother with all of this? For the same reason we avoid rewrites whenever a new programming language comes along: the existing software encapsulates both the current behaviour of the software system, and the desired behaviour: whatever the software is supposed to do, people have adapted to whatever it actually does and so that must be a starting point for any future work.
It is all too easy to say “oh the legacy code is really buggy, we should start from scratch” but those bugs are the way the system—not just the software, but the socio-technical system in which it is embedded—works. The ones that have been fixed are hard-fought lessons about what people want from this software. It may not be well-designed—but it may be. What gets called bad design might actually be good design from a few years ago: software design is fad-led, rather than engineering-led. But it may also be the case that a clean design is hiding under the weight of various patches and hot fixes. This is exactly the situation that Working Effectively with Legacy Code will let you take control of: fixing the software towards a clean design without having to let go of the good, valuable behaviour. And of course in this age of Agile®©, we work to the principle that the primary measure of progress is working software. The legacy software already works.
So why do software engineers prefer to start from scratch? Partly it is because programming is a monetised hobby: people enjoy writing software so they will find ways to do that for income. But it is also a matter of capability. A Master’s-level course in software engineering contains nothing about reading or adapting existing software; not even anything about buy-versus-build decisions. And most professional programmers are not trained software engineers. Without education or experience at reading, understanding, and modifying existing code, programmers see it as difficult, unnecessary effort. Why waste my time learning from person-centuries of experience at solving this problem, when I can type cargo new and have something that solves 5% of the problem badly in maybe a month or two?
And that is where this book comes in. It is your secret superpower. Learn how to work effectively with legacy code and you will be faster and more capable than almost all of the software writers working today, through this one weird trick: not writing most of the software you need.
Cover photo by Adrian Kosmaczewski.
I am thirteen. […]
What is your favourite dinosaur? NICKLAS, STOCKHOLM, SWEDEN
Can you recommend a female poet I should read? TAMMY, ROME, ITALY
Do you think The Red Hand Files have been a success? JONATHAN, CHOBHAM, UK
Dear Steven, Nicklas, Tammy and Jonathan, With the passing of the years, that certain angle […]
The post As a quite handsome man, ‘from a certain angle and in a certain light’, how did you pull Mrs Cave? appeared first on The Red Hand Files.
Sometimes you don’t have all the math functions available that you would like. For example, maybe you have a way to calculate natural logs but you would like to calculate a log base 10.
The Unix utility bc is a prime example of this. It only includes six common math functions: sine (s), cosine (c), arctangent (a), natural log (l), exponential (e), and square root (sqrt). Users are expected to know how to calculate anything else they need from there. (Inexplicably, bc also includes a way to calculate Bessel functions.)
This post collects formulas that let you bootstrap the functions listed above into all the trig and hyperbolic functions and their inverses.
Most programming languages provide a way to compute natural logs but not logs in bases other than e. If you have a way to compute logs in base b, you can compute logs in any other base a via

log_a(x) = log_b(x) / log_b(a).

So, for example, you could compute the log base 10 of a number by computing its natural log and dividing by the natural log of 10.
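A quick sketch in Python, pretending for the moment that only the natural log is available (the standard library of course has math.log10 and a two-argument math.log built in):

```python
import math

def log_base(x, base):
    # Change of base: log_base(x) = ln(x) / ln(base)
    return math.log(x) / math.log(base)

print(log_base(1000, 10))  # 3.0, up to floating point rounding
```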
If you have a way to calculate e^x and natural logs, you can compute x^y via

x^y = exp(y log(x)).

Since square roots correspond to exponent 1/2, you can use this to compute square roots: √x = exp(log(x)/2).
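In the same spirit, a sketch that builds powers and square roots from exp and the natural log (valid for x > 0):

```python
import math

def power(x, y):
    # x**y = exp(y ln x), for x > 0
    return math.exp(y * math.log(x))

def square_root(x):
    return power(x, 0.5)   # square root = exponent 1/2

print(power(2, 10))       # ~1024.0
print(square_root(81))    # ~9.0
```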
If you have a way to calculate sine and cosine, you can calculate the rest of the six standard trig functions:

tan(x) = sin(x)/cos(x),  cot(x) = cos(x)/sin(x),  sec(x) = 1/cos(x),  csc(x) = 1/sin(x).
If you have a way to calculate inverse tangent, you can bootstrap it to compute the rest of the inverse trig functions:

arcsin(x) = arctan(x/√(1 − x²)),  arccos(x) = π/2 − arcsin(x),

with arccot(x) = π/2 − arctan(x), arcsec(x) = arccos(1/x), and arccsc(x) = arcsin(1/x) following from these.

Also, you can use arctan to compute π since π = 4 arctan(1).
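Putting the last two paragraphs into code, a sketch that treats sine, cosine, arctangent, and square root as the only primitives and bootstraps the rest:

```python
import math

# Assumed primitives: sine, cosine, arctangent, square root.
sin, cos, atan, sqrt = math.sin, math.cos, math.atan, math.sqrt

def tan(x): return sin(x) / cos(x)
def cot(x): return cos(x) / sin(x)
def sec(x): return 1 / cos(x)
def csc(x): return 1 / sin(x)

pi = 4 * atan(1)

def asin(x): return atan(x / sqrt(1 - x * x))   # |x| < 1
def acos(x): return pi / 2 - asin(x)

print(tan(1.0) - math.tan(1.0))     # ~0.0
print(asin(0.5) - math.asin(0.5))   # ~0.0
print(pi - math.pi)                 # 0.0
```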
If you have a way to compute exponentials, you can calculate hyperbolic functions:

sinh(x) = (e^x − e^(−x))/2,  cosh(x) = (e^x + e^(−x))/2,  tanh(x) = sinh(x)/cosh(x).
If you can compute square roots and logs, you can compute inverse hyperbolic functions:

arcsinh(x) = log(x + √(x² + 1)),  arccosh(x) = log(x + √(x² − 1)),  arctanh(x) = ½ log((1 + x)/(1 − x)).
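And the final two steps in code: hyperbolic functions from exp, and their inverses from log and sqrt:

```python
import math

def sinh(x): return (math.exp(x) - math.exp(-x)) / 2
def cosh(x): return (math.exp(x) + math.exp(-x)) / 2
def tanh(x): return sinh(x) / cosh(x)

def asinh(x): return math.log(x + math.sqrt(x * x + 1))
def acosh(x): return math.log(x + math.sqrt(x * x - 1))  # x >= 1
def atanh(x): return 0.5 * math.log((1 + x) / (1 - x))   # |x| < 1

print(sinh(1.0) - math.sinh(1.0))     # ~0.0
print(atanh(0.5) - math.atanh(0.5))   # ~0.0
```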
Up to this point this post has implicitly assumed we’re only working with real numbers. When working over complex numbers, inverse functions get more complicated. You have to be explicit about which branch you’re taking when you invert a function that isn’t one-to-one.
Common Lisp worked through all this very thoroughly, defining arctangent first, then defining everything else in a carefully chosen sequence. See Branch cuts and Common Lisp.
The post Bootstrapping a minimal math library first appeared on John D. Cook.