Algorithms to live by
The Quid On:
Algorithms to live by
by Brian Christian and Tom Griffiths
Preview:
Algorithms to live by, one of the best science books of 2016 according to Amazon.com, Forbes, and MIT review, is a must-read for anyone who wishes to explore how our day to day lives are driven by algorithms which work hidden under the surface, and how a better understanding of these algorithms can help us lead better lives! With material sourced from economics, operations research, mathematics, and computer science, Brian Christian and Tom Griffiths explain how algorithms can be all that you need as life hacks!
About the Author(s)
Brian Christian is an author and a poet and is most known for two books which became international bestsellers: The Most Human Human, and Algorithms to live by. Tom Griffiths is a Professor of psychology at UCB and works in the area of Cognitive Science.
The Big Idea: It is not just computer science, algorithms can be applied to life, and they can help us in solving day to day problems, even those which require creative thinking
While the coining of the term algorithm is attributed to a Persian mathematician named Muhammad Ibn Musa al Khwarizmi in the 9th century, the concept and their usage had existed for centuries before him. Though the world algorithm may invoke thoughts of mysterious mathematics in many of you, the fact of the matter is that you follow algorithms in your day to day lives!
Any routine you follow to achieve a desired outcome is an algorithm. From cooking a new recipe you learnt, to driving your car! Algorithms can help us solve hard and unique problems we encounter in day to day lives which are defined by partial information, limited time and resources, unknown unknowns, etc.
You may be surprised to know that an algorithm can even help you choose the right person as a life partner!
Human cognition itself is a way to solve mathematical problems offered by our surroundings, and looking at the world with an algorithmic mindset can help us become better at solving them and also offer insights about the way we understand the world.
Read on to discover how computer science can help us discover how to look at the world, become better at understanding problems we face in our day to day lives, and how best to solve them!
In this quid, you will learn:
How to formulate a strategy to select the best candidate for a particular job
Why to make quick and better decisions, you must think of what you will regret less
Why keeping everything organized is not a good thing
How many roads should a man walk down, before you call him a man? Turns out that the answer is 37%!
In life, the problem is not which choices to make, but mostly, which choices to consider. With a world that increasingly presents to us myriad choices, the problem assumes an even greater significance.
It is not just in the realm of mathematical discourse that a problem of optimal stopping is relevant.
In life, there are many such situations where knowing when to stop, is the key to achieve a favourable outcome.
When does a person stop dating, and decide to pick "the one"?
Consider this dilemma formulated in a famous puzzle known as the "Secretary Problem". If your goal is to hire the best candidate from the pool of applicants who have applied, how many applicants must you interview before deciding which one is the best? The problem is complicated because of the conditions which imply that if you pass over a candidate and go on to the next one, you cannot return to him or her. You must choose a candidate who you are either interviewing or will interview in the future.
As it turns out, the optimal strategy for choosing the best applicant amongst the pool is to interview and pass over the first 37% of candidates, and then select the first candidate amongst the rest who is better than the previous ones! Here, the first 37% of interviewees serve as formulating a standard, which comes handy when finally selecting a candidate.
Moreover, the chances of the selected candidate being the best one is also about 37%. While a 63% chance of failure doesn’t seem inspiring, it must be kept in mind that this strategy will be the same with the same chances of success irrespective of the size of the candidate pool. If you have 50 applicants to interview, the strategy is just the same as it would be if you had 1000 candidates to interview - a success rate of 37%, in contrast to the chances of choosing the single best applicant: 2%, and 0.1%, respectively.
This strategy can be applied to various situations, ranging from picking the "right partner", to selecting the "best" apartment. The 37% rule serves to optimise your chances of making the right selection.
Aim to minimize your regret to make quick and better decisions
In life, we are often faced with situations where we have to decide to keep going on, or to quit and try something new. How does one make a decision when presented by the dilemma : to explore or exploit? Should you quit your job and start-up, or should you continue in the job and milk more money?
There is a framework which can be used for making these kinds of decisions. It is called the regret minimization framework, and it presents a way to solve dilemmas in a way which minimizes regret, thus delivering an optimal solution. Jeff Bezos has said that he employed such a framework when first considering whether to quit his job to start Amazon.com.
For those of you who like gambling, a casino presents a similar dilemma which is a famous mathematical puzzle known as the multi arm bandit problem. The problem essentially is a formulation of the explore/exploit dilemma, in the context of several slot machines. One of the ways of solving this problem is to be an optimist in the face of uncertainty! Why? The upper confidence bound algorithm tells you why.
An upper confidence bound algorithm is a way to assess whether to explore or exploit. Under its scheme of things, the choice should be to choose an option which has a wider confidence interval, even if the expected value is the same as another option. For example, if you have a slot machine which has the record of 1-1 (1 win and 1 loss) in a set of 2 pulls, its confidence interval is much wider than a machine which has 10 wins and 10 losses in a total of 20 pulls. In the second case, there is more data and information about the machine, and hence even with an expected win percentage of 50% (the same as the one with a 1-1 record), it is better to choose the former machine, since its potential for a superior record is unexplored and more unknown.
So, in essence, this tells us that one should be optimistic in choosing new experiences and new activities to minimize regret, which is a fantastic way to make quicker and better decisions.
Being messy is not necessarily a sign of laziness, but algorithms can help sort out stuff. Sorting is an upfront effort to save efforts later
Have you ever felt the frustration when you have to rummage through your drawer to find a matching pair of socks? Sorting as a concept is something which we are all familiar with, even when it operates under the radar in several situations.
Google doesn’t search your text input better than some other search engine, what it does significantly better and faster is that it sorts the top results according to relevance. Relevant sorting lies at the heart of Google's success.
However, as computer scientists would tell you, sorting isn’t required every time and in every case. Sorting as an end in itself is no virtue. It is useful only if there is a balance between the cost of sorting and the cost of searching unsorted stuff.
Afterall, if you are reasonably sure of where a book is placed, do you need to spend a long time sorting your bookshelf? Probably not.
If however, sorting is necessary, as it is in the case of large data sets, or in the cases of sports events, then a mergesort algorithm can help.
Essentially, in the mergesort method, you have to divide the data set into several parts. Sort individual parts, and then pair up two different parts with each other and sort again. Continue this till you eventually have one sorted set. This method is generally used in sports tournaments, like that of the Football Worldcup: A round of 32 (divided into 8 groups of four each), then a round of 16, the quarterfinals, semi-finals, and the finals.
Sorting is prevalent in nature as well and is visible in the manifestation of the pecking order, or social hierarchies amongst animals. A group of chickens fighting establishes a pecking order, as a sorted order of dominance, which saves future fights. This is because once dominance is established, lesser dominant members of the group avoid fights with the more dominant ones, resulting in an ordered state of things.
To figure out how often one needs what, computers use the concept of cache, so does your brain
There is always a tradeoff between storage space and access to the constituents of the storage. If you have a large file drawer, you may be able to keep a lot of files, but you may find looking for those files a cumbersome and time-consuming process.
Even if the files are sorted, the mere act of accessing the relevant position in the sorted list will consume time. In real life, this may be tolerable and not of too much importance, but when it comes to the world of computers, it assumes much greater significance.
The internet seems to work instantaneously. In reality, it, along with almost all storage software work on the principle of caching.
Quite simply, if you store what you need the most in a memory layer which is easily accessible, you can achieve significant performance improvements. The storage space in this memory layer is compromised in favour of faster access. The browser on your laptop uses caching to store temporary files which enable you to browse the internet faster, since the files required for it are already in working memory.
The Cache memory typically stores the most relevant files needed to perform a request or function. Since the cache memory storage is limited, the least recently used algorithm is applied to store relevant files and data. This means that as usage goes on, the least recently used files are deleted from the cache memory to allow for those which are used more recently. You can also apply this algorithm to figure out what to throw away and what to retain from a bunch of old clothes to make way for new ones.
The human brain has a similar system of organisation. It seems that the human brain is also optimised for operating under the conditions which we have. While our capacity for holding memories may be vast, we often have limited time in which to access those memories. Thus, it is an optimal strategy to hold those memories which are "used" often in a cache like system. Those memories which are not "accessed" regularly, fade over time, and new memories, which are more relevant take their place.
Concentrate on finishing the task at hand and forget about other distractions. The scheduling algorithms show you how
In the modern world, there are so many things to be done. Right from morning to night, our lives are full of a list of tasks or errands which need to be completed. How does one approach a list of such things to do in a fashion so as to optimise output?
Some self-help books on time management will tell you to undertake the easiest and the quickest tasks first, while some will tell you to take up the challenging ones earlier. There are several methods which one can employ towards better scheduling of tasks.
The simplest way to approach a list of tasks with an aim of reducing overdue as much as possible is to complete the ones which have a closer deadline. In other words, if you have to finish a task by tomorrow, work on completing that first before you take up something which can be submitted day after. This is the Earliest due date algorithm.
When scheduling, it must be kept in mind that sometimes minor and peripheral tasks can consume bandwidth and more important ones can be ignored. This is because of the phenomenon of priority inversion. A task of lesser priority can overwhelm key resources which are needed to perform a high priority task. This can lead to failure in performing high priority tasks. As an example, the Mars Rover (Pathfinder) suffered from priority inversion and kept ignoring the most important tasks in favour of the mid-level ones. This was later diagnosed as a priority inversion case.
The shortest processing time algorithm can also be useful if the aim is to minimize, say, the total waiting time of clients. For example, if a customer's pizza order can be prepared in 5 minutes, and another customer's lasagne takes 25 minutes to prepare, it makes sense to make and serve the pizza before preparing the lasagne. This is because in case the lasagne is made first, the total waiting time for both customers will be: 25+30=55 minutes, while if the pizza is made first, then the total waiting time for both customers will be: 5+30=35 minutes. Hence it makes sense to make and serve the pizza first.
You can predict the likelihood of events happening with the help of certain algorithms
Can we make decisions or make sense of things if we have a limited sample set to observe or gather information from?
Assume that you bought four lottery tickets, all of which turned out to be winners. Now, if you were to assess how many winning tickets are in circulation, how would you do it? Baye's suggested that while a certain and exact number cannot be estimated, a likelihood can be.
Assume that all the tickets were winners, then your four tickets would win 100% of the time. Now if only half the tickets were winners, your chances of all four winners reduce to 1/2*1/2*1/2*1/2, or nearly 6.2 %! Hence, it is almost 16 times likely that all the tickets are winners!
This method of estimating likelihoods of certain past events, given another set of events (evidence, in this case, is winning tickets) was formulated by Thomas Bayes who was a Reverend in the 18th century England.
However, Bayes estimation does not give us any idea about what the winning odds really are. If you don’t have any idea of the lottery tickets what so ever, what are your chances of winning two winning tickets, given the first ticket is a winner. Laplace's law shows that if you draw a winning ticket in your first attempt out of three, then the likelihood of winning tickets in the entire pool of tickets is 2/3. Or, the number of wins+1, divided by the number of attempts +2. This method offers actionable insight into the day to day events which may interest you. If you want to know the probability of you being late for work today, given you have been late for work X number of times in the past Y occasions, then you just need to use Laplace's law to arrive at a likelihood of you being late as : x+1/y+2.
Algorithms are also at the heart of our communications.
The internet is a bunch of wires and servers which manifest as the information superhighway on our electronic devices. The internet itself is a collection of several protocols, one of the most important one is TCP, or transmission control protocol. The internet works based on packet switches rather than circuit switches. What this means is that instead of dedicated channels used for communications, users' data is shredded into packets which form a part of the overall data flow. There are obvious differences in what such a difference implies. Simply, if your phone network is congested, you will get a message that says that all lines are busy, while if a lot of people are sending emails all over the place, it will just make the system slow.
In America, a probationary judge employed a method which involved prescribing punitive action for violating parole in the form of a short stay in Jail. The first violation required a day's stay in Jail, the second required 4, and so on. This method was in contrast to the earlier more popular method of letting violators off with a warning several times before it became essential finally to punish them with long jail sentences. Surprisingly, this method of prescribing small but exponentially increasing punishments proved helpful for reducing the number of violations overall! As it turns out, this is akin to the exponential backoff algorithm, which is used to solve congestion problems in networks.
The Exponential backoff algorithm is an elegant solution to the problem of overload. If you have experienced network failure in any form, for example, a website or a server crash, you should know that this failure can lead to a prolonged catastrophe if the exponential backoff algorithm is not applied.
Quite simply, in the event of a crash, you would refresh the page, sending the request to the server again, and so would other users. In the case of the internet, the number of "other users" can be in the millions. Hence it is entirely possible and probable that many of these requests would be simultaneous and therefore keep the server overloaded. What this algorithm does is that it sends a request at an exponential delay. The first time, say, for example, it sends a request in 1 second, the 2nd time in 2seconds, the third time in 4, and so on.
An even better algorithm is the additive increase, multiplicative decrease algorithm. This helps to avoid overload in the first place. Essentially it determines the amount of information/data you can send over a particular network, by sending packets of data one by one, incrementally. For example, one packet, followed by 2, then 3, and so on. As the system reaches the point of overload, it cuts back on the data sent by half, and then starts the process of sending incremental data packets again.
Algorithms can help you understand human actions and also enable desirable behaviour, but remember not to overfit models onto data
Why do corporate employees work long hours even when they know the importance of work-life balance? More interestingly, why do people take less holidays, even when they are allowed to take more? There have been cases where employees have been monetarily incentivized to take vacations, but still, it has not counted for much.
The answer is that employees are absolutely right from a game theoretic perspective. Each employee desires to have just a little bit less vacation time than the others, leading the entire set towards taking fewer and fewer vacations. This is because an individual employee associates lesser vacation time, with more possibilities of higher payoffs like promotions, increments, etc.
This can be termed as The Tragedy of Commons.
This is an equilibrium position, though not the best situation for all the employees involved. If, however, minimum vacation is made compulsory, the situation would improve dramatically, thus, enabling desirable behaviour.
In this quid, we have seen the surprising usefulness of algorithms in the world around us, but the thing with them is, that they can also become unduly complex. In such cases, the cost of complexity can be far higher than the benefits of accuracy.
We must avoid the temptation of idolising data and numbers to the point that they become the end goal in themselves. This is called overfitting. Doing something so excessively that the benefits from it do not justify the complexity in achieving them.
One simple way of avoiding overfitting is by constraint relaxation. Consider the question: "What do you want to do?". People would think of various things, but operating under the constraints that they do, viz. money, house, cars, social life, social expectations, etc, they can find it rather difficult to answer this question. Instead, if the question is rephrased as "What would you do if you could not fail?" It becomes easier to answer. Perhaps, one can relax the constraints to answer a question which is simpler, and then work one's way to the truth!
Final Summary
Algorithms are useful not just to mathematicians and computer scientists, they can also help us make better decisions. Our brain also works on intuitive algorithms which we refer to as heuristics. Assisted by Mathematics, we can make better and optimal choices in our day to day lives. From selecting the right apartment, to figuring out the best spot for parking, algorithms can help us in ways that we would be surprised. However, like other things, they also have their limits, and we must keep in mind that while increased complexity may lead to better accuracy, the increased accuracy may not be worth it and can lead to confusing deductions.
Comments
Post a Comment