Quantitative and Physical History


First published on February 28, 2019


This section introduces the subject of quantitative and physical history. Empirical approaches are presented, as well as a unified science of history. It is chiefly a physical model, in that it deals with physical principles, quantities, tendencies and constraints. Attempts are made to do so quantitatively where possible. The work also delves into other areas, such as psychology and traditional history, with the understanding that the author’s rigor and expertise in those areas may be limited.


Table of Contents

  1. Welcome to Quantitative and Physical Methods in History

    This work is broken up into four major sections. First is Basic Skills and Tools, which contains materials that may be applicable to the other sections. Second is Quantitative Methods, which involves modeling and data analysis. Third is Digital Methods, which concerns additional computing techniques such as GIS, text parsing and image analysis that are more common in the digital humanities. Fourth is a section on advanced visualization that will help you convey and display your work to the world.

    Basic Skills and Tools

    You will need several basic skills and tools to use more advanced tools and techniques. They may not be exciting, but are essential. Even if you are familiar with the skills and tools in this section, it is worth a quick skim to refresh your mind. Also, these sections may serve as a useful reference if you get stuck on setting up the more advanced tools.


  2. To What Extent Can History Be Quantitatively Modeled?

    We are concerned with developing a unified science of history, which means that we must be able to propose testable hypotheses. It is much easier to reject hypotheses that can be quantified. Yet given how complicated individuals are, and how much more complex entire societies of many individuals must be, how could it ever be possible to quantify societies? There are ways, but there are some phenomena that first require discussion.

    Regimes As Vast Numbers of People

    Large regimes are composed of vast numbers of individuals. Even a small city might contain tens of thousands of people. Most large urban areas contain millions of people. In modern times, the most powerful countries contain from 50 million to over one billion people.

    Even if a regime is governed by a single individual such as a monarch or dictator, the regime is nevertheless composed of all of the individuals governed, each with their own needs, perspectives, influence and power (even if individually small).

    Individual Freedom of Action

    These thousands and millions of people each possess their own interests and scope of action. Individuals appear to have a significant scope of freedom of action, even when they have limited civil rights.

    Does the Time Make The Hero?

    Does individual freedom translate into freedom of action for the entire regime? This brings to mind an age-old question. Does the time make the hero or does the hero make the time? Consider the following two cases.

    In American football, the San Francisco 49ers were a legendary team in the 1980s. For much of that time, they were led by a legendary quarterback, Joe Montana. In one game, the 49ers were behind with 15 seconds left. Then Joe Montana threw a winning touchdown pass, and the rest was history. Joe Montana was certainly a great quarterback. Yet, while acknowledging Montana’s skill, coach Bill Walsh pointed out that this last-minute play had been rehearsed time and again in a comprehensive system of team training. Montana was part of that system.[1] Without that system, Montana could have thrown a great pass, but there would have been no one there to catch it.

    A second case applies to factory assembly lines.[2] In an assembly line, a conveyor belt moves an uncompleted product past a series of workers. Each worker completes a task, which is often dependent upon the already-expended efforts of workers “up-line.” What if one worker works exceptionally diligently and quickly? What will happen? If the worker processes products too fast, there will be a pile of “work-in-process” waiting in front of the next worker, who is working more slowly. Unless that next worker speeds up, all that will happen is that the factory’s inventory of unfinished goods will increase, which is a waste of money and resources. The factory will be harmed. Or the hard-working worker, dependent upon an “up-line” worker for work-in-process, will simply run out of product to work on and have idle time. In neither case do the extra efforts of the diligent worker contribute to the productivity of the factory, and in one case they even reduce it.[3]

    In a large, interdependent system, such as a large society, the conclusion here is that it is the time that makes the hero, even if the time is silent as to which individual will earn the title of hero.

    Regime As The Summation of Individual Behavior

    A regime can be viewed as the sum of the individual contributions and actions of its individuals. When one thousand carpenters strike one thousand nails with one thousand hammers, the regime is one thousand nail-hits richer. A city is a single legal entity, but it is composed of numerous houses, factories, shops and other structures.

    This summation effect appears to tend to cancel out any effect of individual free will over material lengths of time.  Many individuals will behave in one way while many others will behave in the opposite way. Some carpenters will drive nails into boards, while others will remove nails. The larger the society, the greater this canceling effect will tend to be.
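    The canceling effect can be illustrated with a minimal Monte Carlo sketch (the numbers and the ±1 “actions” are hypothetical stand-ins for opposed individual choices): the net sum of N random ±1 actions typically grows only like √N, so the net effect per individual shrinks as the society grows.

```python
import random

def net_action(n_individuals, seed=0):
    """Net effect of n individuals each acting +1 or -1 at random
    (e.g., carpenters driving nails versus removing them)."""
    rng = random.Random(seed)
    return sum(rng.choice((1, -1)) for _ in range(n_individuals))

# The net effect per individual shrinks as the society grows:
# |net| typically grows like sqrt(N), so |net| / N shrinks like 1 / sqrt(N).
for n in (100, 10_000, 1_000_000):
    print(n, abs(net_action(n)) / n)
```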

    Is a regime then completely at the mercy of historical destiny? This is not necessarily so, but the ways a regime can escape its “destiny” are limited and fairly specific in nature.

    Regimes As Producers and Consumers of Resources

    Regimes can be viewed as producers and consumers of resources. Just as an individual human requires air, water, food and other goods, so does a city, albeit in larger amounts. Humans are to regimes as cells are to the human body. Great networks of blood vessels supply nutrients to individual cells and carry away waste. Networks of nerves convey information. In a contemporary society, water is carried in great aqueducts, rail lines and freeways channel in nutrients and remove garbage, and a myriad of telephone and internet lines transmit information. Such resources can be anything necessary to sustain the regime.

    Resource Exhaustion

    Some resources are partially renewable, such as agricultural production. Others are limited and can be totally exhausted. Such resources include fossil fuels, ground water, and old growth forests, for example. Social resources can also be exhausted. In business, social resources are accounted for under the term “good will” and even have a quantitative financial value placed upon them.

    Societies Dependent Upon a Nonrenewable Resource

    When a regime is substantially dependent upon a limited, nonrenewable resource, it can be modeled as a function of a normal distribution or other similar distribution. Regimes which are substantially dependent upon mining mineral reserves such as gold and silver are a prime example. Spanish governance over the New World and the mining communities of the San Juan region of Colorado were both highly dependent upon producing gold and silver. Even where a critical physical resource is not apparent, most regimes are dependent upon limited social resources and can therefore be modeled as a Hubbert curve.
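    As an illustration, a Hubbert-style production curve (the derivative of a logistic depletion curve) can be sketched in a few lines of Python; the mining parameters below are purely hypothetical.

```python
import math

def hubbert_production(t, q_total, rate, t_peak):
    """Hubbert curve: annual production of a finite, nonrenewable
    resource, i.e. the derivative of a logistic depletion curve.

    q_total -- ultimately recoverable quantity of the resource
    rate    -- steepness of the rise and decline
    t_peak  -- year of peak production
    """
    x = math.exp(-rate * (t - t_peak))
    return q_total * rate * x / (1.0 + x) ** 2

# Hypothetical silver-mining regime peaking in 1780:
for year in (1700, 1780, 1860):
    print(year, round(hubbert_production(year, 1000.0, 0.05, 1780), 2))
```

    Production rises, peaks at t_peak (where the value is q_total × rate / 4), and declines symmetrically as the resource is exhausted.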

    Transition Points

    A new society grows exponentially. Its people expect that exponential growth will continue. They frequently do not recognize limits to growth soon enough. Production then fails to match expectations, leading to social disruption. The point where growth slows and expectations diverge from actual production represents a transition point, and may graphically appear as an inflection point.
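    The divergence between exponential expectations and leveling-off actual production can be sketched as follows (all parameters hypothetical); the gap between the two curves widens around the transition point.

```python
import math

def logistic(t, k, r, t_mid):
    """Actual production: growth that levels off at carrying capacity k,
    with an inflection (transition) point at t_mid."""
    return k / (1.0 + math.exp(-r * (t - t_mid)))

def expected(t, p0, r):
    """Expected production: early exponential growth extrapolated forever."""
    return p0 * math.exp(r * t)

# Hypothetical society: capacity 100, growth rate 5%/year, transition at year 50.
k, r, t_mid = 100.0, 0.05, 50.0
p0 = logistic(0, k, r, t_mid)  # calibrate expectations to initial production

for t in (0, 25, 50, 75):
    gap = expected(t, p0, r) - logistic(t, k, r, t_mid)
    print(t, round(gap, 1))  # the expectation gap widens over time
```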


    [1] William Walsh et al., Finding the Winning Edge.

    [2] This example is inspired by Eliyahu Goldratt, The Goal. Great Barrington, MA: North River Press, 1992.

    [3] Goldratt, The Goal.


  3. Quantitative Methods In History


    There are two schools of thought regarding the nature of history as a discipline. Some consider history to be among the humanities. Others consider it to be a social science. Of course, the two are not mutually exclusive. This work does not propose a quantitative approach as a replacement for narrative, scholarly approaches, but rather to provide useful additional tools.

    Entering into quantitative methods raises several questions, such as “What can be quantified?”, “How accurate and meaningful is historical quantitative data?”, “What can be modeled?” and “How accurate are such models?”

    Nearly anything can be quantitatively modeled, either directly or by proxy. Even love can be quantified through proxies, such as in terms of hours a day spent thinking about someone, or the cost of an engagement ring relative to income or assets. Or perhaps even directly, via electrodes wired to the brain.

    A more penetrating matter regards the accuracy of such models. Proxies can always be found, and some data can always be found, yet it may not be abundant or precise enough to produce models of sufficient accuracy for the purposes desired. This question must be answered on a case-by-case basis, although it can be possible to make generalizations about accuracy. For example, the further one goes back into history, the less abundant and accurate the data generally becomes. Also, war casualties, especially further back in history, are often suspect, and tend to be exaggerated up or down depending on the perspective of the source.

    Finally, is the cost worth it? Much data can be obtained, but it can often be costly to gather and process it. Can one afford that and is it worth the cost for the benefit obtained?

    Types of Quantities

    Many different things can be quantified, although some are more obvious than others. Battles are often quantified by the numbers of soldiers fighting on each side, as well as by casualties and reparations. There is often commercial and trade data, such as how much wheat was produced in a kingdom, or taxes on trade. There is geographic data, such as how many square kilometers a kingdom ruled, how much rainfall that area received, and how long were trade routes. Finally, there is time data, such as how long certain historical persons lived, or how long their dynasties endured.

    Types of Models

    Examples of simple models that can be easily visualized are introduced: linear, quadratic, exponential growth, logistic, and efficiency-discounted exponential growth (EDEG). Simulation tools are briefly introduced, such as pen-on-paper, MS Excel, Ruby, Python, R, Wolfram Alpha, Processing, SVG, and geographic information systems (GIS).
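    A minimal sketch of these model functions in Python follows. The first four forms are standard; the EDEG form is an assumed reading (exponential growth multiplied by a linearly declining efficiency factor), not necessarily the exact formulation used later in this work.

```python
import math

def linear(t, m, b):
    """Straight line: slope m, intercept b."""
    return m * t + b

def quadratic(t, a, b, c):
    """Parabola: a*t^2 + b*t + c."""
    return a * t**2 + b * t + c

def exponential_growth(t, p0, r):
    """Unconstrained growth at rate r from initial value p0."""
    return p0 * math.exp(r * t)

def logistic(t, k, r, t_mid):
    """Growth that levels off at carrying capacity k."""
    return k / (1.0 + math.exp(-r * (t - t_mid)))

def edeg(t, p0, r, d):
    """Efficiency-discounted exponential growth (assumed form):
    exponential growth times a linearly declining efficiency factor,
    producing a rise-and-fall profile."""
    return p0 * math.exp(r * t) * max(0.0, 1.0 - d * t)
```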

    Computational History

    Computational history is related to quantitative methods. Computational history is a subset of the digital humanities. Computational tools can be extremely useful for simulating, illustrating and visualizing certain aspects of history. That said, there should be a lot of thinking before computational techniques are brought into play. One can generate considerable data and even impressive graphics that don’t really mean anything, or are just plain incorrect or misleading. Remember the lessons of Merlin’s Apprentice!




    • Wikidata is a source of a great variety of data, including historical data. While the reliability and completeness can vary, it can sometimes be a good starting point. The data may be contained in a variety of different documents.
    • Wolfram Alpha can be asked questions that will sometimes produce historical data.

  4. Physical Approach to History


    Imagine the hot sun shining brightly upon the Earth situated in cold space. Much light is reflected back from the Earth into space. The remainder of the light is absorbed by the Earth and heats its surface. Nature abhors temperature differences, and tries to rectify the situation as quickly as possible by having the Earth emit heat back into space. Yet the Earth’s atmosphere is a good insulator. To bypass that insulation, great blobs of hot air at the surface rise wholesale into the cooler upper regions of the atmosphere, so the escape of heat is greatly increased, and Nature is pleased.

    Yet the light that gets reflected from the Earth is not heat, nor does it do much to warm the coldness of space. Nature does not gladly tolerate such rogue light. So living organisms develop upon the Earth that can capture and photosynthesize some of the rogue light. Those organisms release heat or are consumed by other organisms that produce heat. Nature is still not satisfied and demands greater haste. Intelligent organisms form that can release heat faster, and civilizations form that can release heat yet faster, further pleasing Nature.

    Nature is greedy and demands all that it can seize. Just as great blobs of air form and rise through the atmosphere, dynasties and empires form in succession one after another, releasing heat that is otherwise inaccessible. History is literally a pot of water boiling on a hot stove in a cold kitchen, with dynasties and empires forming and bubbling up to the surface. Is there more that Nature can yet demand? New technologies and untapped sources of energy? New forms of civilization? Or the yet totally unknown?

    This book is intended to serve as an introduction and handbook. Rich descriptions as well as much technical detail have been omitted to improve readability and avoid confusion. Additional sources of information are cited for the reader who wishes to know more. In this book, you will envision how humans are linked to the entire universe and how we share its drive and destiny. Unfortunately, Physical History (PH) does not provide quick, easy answers to society’s challenges. Nevertheless, you will discover analytical tools as powerful as the astronomer’s telescope and the biologist’s microscope to investigate human affairs. This is a tall order to fill. It is best to remember that this book is more of a framework of perspectives and tools to help you get started, rather than an encyclopedia of answers. This is still a pioneering field. There are considerable opportunities for further contributions of the greatest significance.

    PH derives social science primarily from physics, but also from other areas such as cosmology, ecology and psychology. PH is more fundamental than social science derived merely from the observation of humans, because it views the existence of humans as the result of cosmological trends and physical processes. Likewise, PH strives to be generic, so that it can be used to describe and analyze any society anywhere and anytime, be it the Carolingian dynasty in medieval France or an extraterrestrial society across the galaxy. Observation strongly suggests that the laws of physics remain invariant across time and space, allowing for the possibility of a truly generic, non-geocentric social science derived from physical principles.

    Although PH is based upon the physical sciences, no claim is made for its ability to “produce” a perfectly deterministic science. In fact the approaches of PH are only practical because people act as individuals and have a wide freedom of action. This seems paradoxical, but that is the way things work out.

    Inner Versus Outer Philosophy

    In ancient times, natural (outer) and social (inner) philosophy were closely linked. Then, a philosopher’s view of the composition of matter might be closely linked to their view of the best type of government for society. This unity of inner and outer philosophy continued in Europe until the Renaissance.[1] However, the heliocentric universe proposed by Copernicus and the findings of imperfect heavens by Galileo were deemed inconsistent with the inner, social philosophy of that time. The resulting severance of inner and outer philosophy began in earnest and has continued to this day.

    PH approaches social science from the perspective of outer philosophy. Both approaches are necessary for the development of a complete and meaningful social science. We are humans who attempt to develop social science. We try to be impartial, but must admit that our ability to do so is inherently limited. Motivation and incentives are always a factor in what gets studied. Why should we develop social science if it does not benefit those of us who endeavor to do so? Even physical scientists are human and have the same sort of needs that other people have. The subject of psychology and how it colors people’s reaction to PH is discussed in a later section.

    A Unified Model

    The social sciences already utilize some quantitative methods. Economists utilize them perhaps exhaustively and several historians practice cliometrics. Nevertheless, the social sciences have lacked the type of unified model that Newton provided for the physical sciences. Ever since Newton created his three laws to describe the mechanical universe, numerous philosophers and social scientists have tried to create a mechanical model of society without success. Meanwhile, in the early 1900s, Newton’s laws of mechanics were shown to be idealizations of a much less deterministic, statistical universe. Ironically, it is the fall of Newtonian mechanics that allows for the achievement of a true “science of society.” PH is not the purely deterministic dream of early “Newtonian” sociologists. Rather PH uses concepts from modern statistical mechanics to provide a firm foundation for a fundamental understanding of history and economics.

    This book provides the skeleton of such a unified model. The Principle of Fast Entropy, an extension of the Second Law of Thermodynamics[1], is suggested as a unifying, driving principle. Just as gravity is the key force in Newton’s unified model of the physical universe, Fast Entropy is the key tendency for a unified model of the social universe. Fast Entropy is literally the “gravity” of social science. Fast Entropy applies to both the social and physical sciences. Fast Entropy can be used to analyze, understand and validate other economic and historical methodologies. It is a constraint that can be used to identify other constraints. In science, a known constraint is a valuable piece of knowledge.

    The author hopes you will find this text useful. The philosophical implications are glossed over in favor of presenting pragmatic approaches and tools. It is hoped that this work will stimulate you to develop your own ideas and approaches, for one of the fundamental characteristics of science is that it is always unfinished.

    Notes & References

    [1] H. Scott had previously proposed deriving social science from thermodynamics, in particular the works of W. Gibbs, in the 1920s. Source: www.technocracy.org.

  5. Units for Historical Quantities

    Quantities in Context

    A quantity typically does not mean much unless it is expressed in terms of a unit. Which is longer, 1 or 10000? It is hard to tell. Perhaps the 1 refers to a kilometer, while the 10000 refers to millimeters. Units in use have changed over the years. This section explores some of those units.

    Origin of Units


    Major units of time were the product of astronomical phenomena. The day represented the rotation of the Earth with respect to the Sun. The month represented the full cycle of lunar phases. The year represented the full orbit of the Earth about the Sun. It is the inclination of the Earth with respect to its orbital plane that results in the seasons, which were of tremendous importance for growing crops, seasonal migrations and many other matters of importance to survival. These units are still with us and fairly universal among humans throughout history, although the exact magnitude of the unit varies over time and across societies. Of course, their names vary considerably.

    Shorter units such as the hour, minute and second are considerably more arbitrary in origin, although their modern definitions may be quite precise.

    Mass, Weight and Volume

    Units of mass and weight were sometimes derived from the means of measuring these quantities. For example, a cup is a unit of volume obviously derived from a physical cup. Other units were based on how much a hand could hold, or how much a typical person could carry in a hand-held container. Still other units were based on the amount of a typical object being measured.


    Length

    Some units of length were based upon human physical characteristics, such as the length of a hand or arm. Other units were based on how far a typical person or horse could walk or run in an hour or day.


    Ancient Units

    The following examples of historical and modern weights and measures are not comprehensive.


    “Egyptian cubit developed around 3000 BC. Based on the human body, it was taken to be the length of an arm from the elbow to the extended fingertips. Since different people have different lengths of arm, the Egyptians developed a standard royal cubit which was preserved in the form of a black granite rod against which everyone could standardise their own measuring rods.”[1]

    “The digit was the smallest basic unit, being the breadth of a finger. There were 28 digits in a cubit, 4 digits in a palm, 5 digits in a hand, 3 palms (so 12 digits) in a small span, 14 digits (or a half cubit) in a large span, 24 digits in a small cubit, and several other similar measurements.”[1]


    The Babylonians developed measures  around 1700 BC. “Their basic unit of length was … the cubit. The Babylonian cubit (530 mm), however, was very slightly longer than the Egyptian cubit (524 mm). The Babylonian cubit was divided into 30 kus which is interesting since the kus must have been about a finger’s breadth but the fraction 1/30 is one which is also closely connected to the Babylonian base 60 number system. A Babylonian foot was 2/3 of a Babylonian cubit.”[1]


    “Harappan civilisation flourished in the Punjab between 2500 BC and 1700 BC. An analysis of the weights discovered in excavations suggests that they had two different series [of weights and measures], both decimal in nature, with each decimal number multiplied and divided by two. The main series has ratios of 0.05, 0.1, 0.2, 0.5, 1, 2, 5, 10, 20, 50, 100, 200, and 500. Several scales for the measurement of length were also discovered during excavations. One was a decimal scale based on a unit of measurement of 1.32 inches (3.35 centimetres) which has been called the “Indus inch”. Of course ten units is then 13.2 inches (33.5 centimetres) which is quite believable as the measure of a “foot”. Another scale was discovered when a bronze rod was found to have marks in lengths of 0.367 inches. …Now 100 units of this measure is 36.7 inches (93 centimetres) which is about the length of a stride”[1]

    Ancient Greece

    “The Greeks used as their basic measure of length the breadth of a finger (about 19.3 mm), with 16 fingers in a foot, and 24 fingers in a Greek cubit. These units of length, as were the Greek units of weight and volume, were derived from the Egyptian and Babylonian units. Trade, of course, was the main reason why units of measurement were spread more widely than their local areas. In around 400 BC Athens was a centre of trade from a wide area.”[1]

    Ancient Rome

    “The Romans adapted the Greek system. They had as a basis the foot which was divided into 12 inches (or ounces for the words are in fact the same). The Romans did not use the cubit but, perhaps because most of the longer measurements were derived from marching, they had five feet equal to one pace (which was a double step, that is the distance between two consecutive positions of where the right foot lands as one walks). Then 1,000 paces measured a Roman mile which is reasonably close to the British mile as used today.”[1]


    Early Europe

    “The Angles, Saxons, and Jutes brought measures such as the perch, rod and furlong. The fathom has a Danish origin, and was the distance from fingertip to fingertip of outstretched arms while the ell was originally a German measure of woollen cloth.”[1]


    Modern Units


    U.S. customary units include:

    • Ounce, Pound
    • Inch, Foot, Yard, Mile
    • Fluid Ounce, Cup, Pint, Quart, Gallon


    SI (metric) units include:

    • Length: Meter
    • Mass: Kilogram
    • Volume: Cubic Meters (\(m^3\)) or Liters
    • Energy: Joules
    • Power: Watts
    • Time: Seconds



    1. U. of St. Andrews, The history of measurement
    2. Russ Rowlett, English Customary Weights and Measures
    3. U.S. National Institute of Standards and Technology (NIST), Definitions and historical context of SI

  6. Basic Calculations Using Units

    Basic Calculations

    Most historical calculations will involve units. There are a few tricks that will make working with units easier and more useful.

    Units Analysis

    Physicists have a secret called “units analysis”. By assigning all quantities a unit, they can then check that the final answer results in the required unit. If it does not, then there has been a mistake. Also, a unit is required for a physical quantity to be meaningful. A weight of “100” does not mean much, whereas a weight of 1 billion points is a bit more impactful.

    Here is an example of units analysis:

    Jean walks 10 miles. It takes Jean 5 hours to walk that distance. To find Jean’s mean speed, we divide distance by time:

    speed = distance/time

    speed = 10 miles / 5 hours

    speed = 2 miles per hour.

    Miles per hour is indeed a unit of speed, so the answer could be correct. Conversely, if the answer came out to be hours/mile the answer would clearly be incorrect.
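    This bookkeeping can be automated. A minimal sketch (illustrative only, not a full units library) tracks unit exponents through a division, so miles divided by hours comes out as miles per hour:

```python
from collections import Counter

def divide(value_a, units_a, value_b, units_b):
    """Divide two quantities while tracking units.

    Units are dicts mapping unit name -> exponent,
    e.g. {"mile": 1} for miles, {"hour": -1} for per-hour.
    """
    exponents = Counter(units_a)
    exponents.subtract(units_b)  # division subtracts unit exponents
    units = {u: e for u, e in exponents.items() if e != 0}  # cancel zeros
    return value_a / value_b, units

# Jean walks 10 miles in 5 hours:
speed, speed_units = divide(10, {"mile": 1}, 5, {"hour": 1})
print(speed, speed_units)  # 2.0 {'mile': 1, 'hour': -1}, i.e. miles/hour
```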

    Converting and Canceling Units

    Using units and calculating results will often involve converting one unit into another. An example will make this clear.

    A farmer has an orchard with 10 trees. During the summer, each tree provides 2 bushels of apples per week, for 3 weeks. How many bushels of apples are produced by the orchard each summer?

    bushels of apples/summer = (3 weeks/summer) x (2 bushels of apples/tree/week) x 10 trees


    bushels of apples/summer = (3/summer) x (2 bushels of apples/tree) x 10 trees

    bushels of apples/summer = (3/summer) x (2 bushels of apples) x 10


    bushels of apples/summer = (3 x 2 x 10) (bushels of apples/summer)

    Answer: 60 bushels of apples/summer
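    The same computation can be written as a short Python function, with the unit cancellation recorded in the comment:

```python
def orchard_bushels_per_summer(weeks_per_summer, bushels_per_tree_per_week, trees):
    # (weeks/summer) x (bushels/tree/week) x trees -> bushels/summer:
    # "weeks" and "trees" cancel, leaving bushels per summer.
    return weeks_per_summer * bushels_per_tree_per_week * trees

print(orchard_bushels_per_summer(3, 2, 10))  # 60
```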



  7. Significance, Uncertainty and Error


    There is a limit to the significance of quantities. Here, we are only referring to mathematical significance. A quantity is only significant to within half of the smallest unit being measured. For example, if you measure a distance with a meter stick, and the stick is only ruled in 1 centimeter units, then you can express the distance to within half of the smallest subdivision of the rulings: in this case, 0.5 cm. So the significance would be 0.5 cm, which gives three digits, e.g. 91.5 cm.


    Measurements involve a degree of uncertainty. Once again, here, we are only referring to mathematical uncertainty. Using the previous example, the distance is likely not exactly at any specific centimeter ruling. It is somewhere between centimeter marks, and it can sometimes be a bit of a judgment call to determine which is the closest mark. So the uncertainty here would be plus or minus 0.5 cm. So if the distance was measured as 91.5 cm, then the measurement would be expressed as 91.5 cm +/- 0.5 cm. This sort of error cannot typically be eliminated.

    Determining uncertainty for historical data will often be much more challenging. At this stage of the field, it is reasonable to determine a rational basis for the quantity of uncertainty, and to include an explanation of the rationale.


    Measurements can be subject to systematic error. This type of error occurs due to a consistent flaw in the measurement system. For example, suppose the end of the meter stick was once cut off at the 1 cm mark, so that it consistently misstates the distance by 1 cm. Such sources can sometimes be identified through examination of the measuring apparatus, and eliminated if identified.

    In history, there are some patterns that may comprise the equivalent of systematic error. First is the tendency to overestimate battle casualties, especially for the enemy. Second is the tendency to understate production when it is reported for taxes, and to overstate it when it is told in a good story.

    Statistical Approaches to Improving Quality of Data

    There are several statistical approaches to improving the quality of the data. If measurements can be repeated for the same quantity, such as of a distance, length or volume, then it is often possible to use a combination of individual measurements to achieve a more accurate one. Unfortunately, some historical quantities are not subject to repeat measurement. It may be possible to combine measurements of different, but similar phenomena, to improve the quality, but this approach must be handled with care and an extra degree of critical analysis and review.

    A rough measure of uncertainty is the variation between several neighboring data points. While some of that variation could be real, some of it is likely due to error.
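    Where repeated measurements are available, combining them is straightforward. A minimal sketch (the measurement values below are hypothetical) reports the mean and the standard error of the mean:

```python
import math

def combine_measurements(values):
    """Combine repeated measurements of the same quantity.

    Returns (mean, standard error of the mean); the standard error
    shrinks as more independent measurements are combined.
    """
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)

# Five hypothetical measurements of the same distance, in cm:
mean, sem = combine_measurements([91.5, 91.0, 92.0, 91.5, 91.5])
print(f"{mean:.2f} cm +/- {sem:.2f} cm")
```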

  8. Creating And Using Models

    [Figure: front exterior of Notre Dame cathedral, with the main entrance flanked by two bell towers.]

    A model is a hypothesis about how something exists or works. A model could be a small version of something large, such as a table top copy of the Notre Dame cathedral in Paris. Such a model would represent the large, most important features of the cathedral such as the towers and flying buttresses, and possibly representations of some of the more distinctive smaller features such as the stained glass windows.

    A model can also be one or a set of mathematical equations that relate one quantity to another. For example, an equation could relate dynasty power to time. Such a model could be refined, for example to represent central versus regional power. We will primarily be concerned with creating quantitative models.

    Creating a quantitative model is really easy. Just relate two quantities to each other. For example, write the following equation:

    \(quantity~of~Roman~empire~soldiers = year~in~CE\).

    According to this model, the number of soldiers in the Roman empire is equal to the year in current era years. So in 100 CE (AD), the number of Roman imperial soldiers would be 100. This certainly is a model, because it produces results that can be compared with actual data. Historians evaluate the validity of such data, which may come from literary or archeological sources, and then can compare it with the model. A range of uncertainty is estimated. If the model produces a result that is not within the range of uncertainty for the data, the model is rejected or revised. If the model fits within the range, then it is valid, although not necessarily absolutely correct (no model ever gets proved) or representative of ultimate truth. Generally, models that fit the data best and are consistent with other valid models tend to be more accepted.

    It often requires several attempts to get a valid model, and many attempts to obtain better ones. The above example concerning Roman soldiers can be quickly rejected using commonly available data.
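    The rejection step can be sketched in a few lines of Python. The data points and uncertainty ranges below are purely hypothetical placeholders for illustration, not actual historical estimates:

```python
def model_soldiers(year_ce):
    """Toy model: the number of Roman imperial soldiers equals the CE year."""
    return year_ce

# (year, estimated soldiers, uncertainty) -- hypothetical values for illustration
data = [(100, 300_000, 50_000), (200, 400_000, 80_000)]

for year, estimate, uncertainty in data:
    predicted = model_soldiers(year)
    within_range = abs(predicted - estimate) <= uncertainty
    print(year, predicted, within_range)  # the model misses every range
```

    Since the model falls far outside every uncertainty range, it is rejected.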

    Models can be improved by including additional terms and changing parameters. For example, adding a baseline number of soldiers, and then a term that takes into account the growth of mercenaries, might improve the accuracy of the model.

    In history, often the available or accepted data is limited, and uncertainties may be high. So initially, a more pragmatic approach may be to propose models and explore to what extent they might be valid.

  9. 9. Basic Modeling Techniques

    Below are several types of functions commonly used in modeling.

    Linear Models (Straight lines)

    A linear model is the simplest form of quantitative model (aside from a single point).

    Mathematically, such a model takes the form of:

    \(y=mx + b\),

    where \(y\) is the dependent variable, \(x\) is the independent variable, \(m\) is a constant number that determines the slope of the line, and \(b\) is a constant number called the y-intercept.


    Graphical Interpretation

    Cartesian coordinates consist of two lines that intersect at a right angle, where the horizontal line represents the x axis and the vertical line represents the y axis of a graph (more accurately called a plot). This is the most common type of graph. If a linear model is plotted on Cartesian coordinates, it appears as a straight line.
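    As a minimal sketch, a linear model is a one-line function; the slope and intercept values here are arbitrary:

```python
def linear_model(x, m, b):
    """Straight-line model y = m*x + b."""
    return m * x + b

# Slope 2 and intercept 1, evaluated at x = 0, 1, 2, 3:
print([linear_model(x, m=2, b=1) for x in range(4)])  # [1, 3, 5, 7]
```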



    For example, the simplest approach to model the power progression of a dynasty is a combination of two linear models. This requires the start and end years of the dynasty, and its peak year, as input parameters. For a single dynasty, the magnitude of the peak can be set to a nominal value of 1. Simply calculate (or draw) a straight line from a relative power value of 0 at the start year of the dynasty to 1 at the peak year. How to select the peak year is less clear. A naïve approach would be to pick the chronological midpoint. An objective selection would be the date of maximum economic production, if that datum is available. Or one could choose a significant event, although this is less objective. For example, we will choose 1796, the year of the death of Catherine the Great, one of the most powerful rulers of Russia (Mazour and Peoples 1975). Figure (below, left) shows this first line.

    Next, calculate a straight line from magnitude 1 at the peak year to magnitude 0 at the end year, again determined by its slope and vertical intercept, shown in Figure 1 (below, right). This model has several disadvantages, among which are that it is discontinuous at its peak and that it assumes linear growth and decay. Further, it tells us little about what the underlying causes and factors may be.

    Left: one line rising. Right: line rising to a peak around 1810; second line falling to zero around 1920.

    Left: one linear model. Right: two linear models
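    The two-segment approach can be sketched as a small Python function. The peak year 1796 follows the text; the start and end years (1613 and 1917, the conventional dates of the Romanov dynasty) are supplied as illustrative defaults:

```python
def power_two_lines(year, start=1613, peak=1796, end=1917):
    """Relative dynasty power from two straight-line segments: 0 at the
    start year, rising to 1 at the peak year, then falling back to 0
    at the end year."""
    if year < start or year > end:
        return 0.0
    if year <= peak:
        return (year - start) / (peak - start)
    return (end - year) / (end - peak)

print(power_two_lines(1796))  # 1.0 at the peak
print(power_two_lines(1917))  # 0.0 at the end of the dynasty
```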

    Inverted quadratic function

    The next simplest approach is to use an inverted parabola. Geologist M. King Hubbert considered this approach when he modeled peak oil (Hubbert 1980). As before, one can nominally set the peak magnitude. Here, one does not take the peak year as a parameter, but rather sets suitable parameters so that the parabola intersects the x-axis at the start and end dates of the dynasty (see Figure 2). This approach produces models with rapid growth and decay, yet with a relatively long period of relative stability in between. A chief disadvantage is that the symmetry of this function forces one to assume a peak year at the midpoint of the dynasty's life. Also, this model tells us very little about underlying causes.


    Plot rises steeply, levels off, peaks, then declines slowly and finally quickly.

    Romanov dynasty modeled by an inverted parabola
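    A sketch of the inverted-parabola model, normalized so that the peak equals 1; the Romanov start and end years used as defaults are illustrative parameters:

```python
def power_parabola(year, start=1613, end=1917):
    """Inverted parabola with zeros at the start and end years,
    normalized so the peak (forced to the midpoint) equals 1."""
    if year < start or year > end:
        return 0.0
    half_span = (end - start) / 2
    return (year - start) * (end - year) / half_span ** 2

midpoint = (1613 + 1917) / 2  # symmetry forces the peak to the midpoint
print(power_parabola(midpoint))  # 1.0
```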


    Normal distribution

    A more sophisticated approach is to model the dynasty as a normal distribution, or a mathematically similar function. Hubbert utilized bell-shaped plots for peak oil (Hubbert 1956, 1980). Such a function suggests that a dynasty comprises a collection of semi-random, resource-related events that cluster about the dynasty’s peak. This approach can account for resource-based factors, where there is a degree of randomness in the ability to obtain such resources. For example, if there is a resource such as a critical mineral or petroleum where discoveries are to some degree by chance, this approach begins with few mining events, picks up with growth, then levels off with the difficulty of finding new deposits (Ciotola 1997). A chief disadvantage is that this model forces symmetry upon the dynasty’s rise and fall. Another disadvantage is that one must literally begin with the peak and work to the endpoints. It provides no readily apparent means to a priori simulate the emergence of a dynasty.

    The normal distribution approach also requires selecting a standard deviation value. Since the vertical axis represents power, and the area under the plot represents cumulative power, then one can set a standard deviation to produce the corresponding percentage of cumulative power. One can also determine the ratio of peak-to-start power, and use that to set the end-points. Note that a smaller standard deviation results in a sharper peak, whereas a greater one produces models that approach an inverted parabola (see below figure). This characteristic can be used to reject particular models based on available evidence.

    Three separate plots. Smaller standard deviation relative to lifetime produces thinner peaks.

    Normal distribution form models for power of Romanov dynasty
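    A minimal sketch of the normal distribution form, normalized so the peak value is 1. The peak year follows the text; the standard deviation of 50 years is an illustrative choice, not a fitted value:

```python
import math

def power_normal(year, peak=1796, sigma=50.0):
    """Bell-shaped (normal distribution form) power model, normalized
    so the value at the peak year equals 1. A smaller sigma gives a
    sharper peak; a larger one approaches an inverted parabola."""
    return math.exp(-((year - peak) ** 2) / (2 * sigma ** 2))

print(power_normal(1796))            # 1.0 at the peak
print(round(power_normal(1846), 3))  # 0.607 one standard deviation later
```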


    Maxwell-Boltzmann distribution

    The Maxwell-Boltzmann distribution rises quickly then declines slowly. (It is possible to alter this distribution so that the opposite occurs.) This distribution involves a quadratic function leading to a rise, and exponential decay leading to a fall. So it is a qualitatively reasonable candidate for modeling the progression of a dynasty. The Maxwell-Boltzmann distribution has most of the advantages of the normal distribution, and it allows asymmetry. Disadvantages include that this model is less simple both conceptually and mathematically, and requires more parameters than the approaches discussed above. An initial attempt to model the rise and fall of the Colorado San Juan mining region (Ciotola 1995) is shown in Figure 4. The parameters were adjusted to fit several data points provided by a historical account of the region by D. Smith (1982).

    Curve rises from zero in 1868 to $9.5 million in 1900, then falls to zero by 1920.

    Maxwell-Boltzmann distribution model of gold and silver production in the San Juans
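    A sketch of a Maxwell-Boltzmann-shaped model in the spirit of the figure: a quadratic rise multiplied by an exponential decay in the squared year-offset. The start year and peak value are read off the figure; the scale parameter, which fixes the peak year, is an illustrative choice rather than a fitted value:

```python
import math

def mb_production(year, start=1868, scale=22.0, peak_value=9.5e6):
    """Maxwell-Boltzmann-shaped model, normalized so that the
    curve's maximum equals peak_value."""
    t = year - start
    if t <= 0:
        return 0.0
    raw = t ** 2 * math.exp(-t ** 2 / (2 * scale ** 2))
    mode = math.sqrt(2) * scale            # offset at which the raw curve peaks
    raw_peak = mode ** 2 * math.exp(-1.0)  # raw value at that offset
    return peak_value * raw / raw_peak

peak_year = 1868 + math.sqrt(2) * 22.0     # about 1899
print(round(mb_production(peak_year) / 1e6, 1))  # 9.5 (millions of dollars)
```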



    • Ciotola, M. 1997. San Juan Mining Region Case Study: Application of Maxwell-Boltzmann Distribution Function. Journal of Physical History and Economics 1.
    • Hubbert, M. K. 1956. Nuclear Energy and the Fossil Fuels. Houston, TX: Shell Development Company, Publication 95.
    • Mazour, A. G., and J. M. Peoples. 1975. Men and Nations, A World History, 3rd Ed. New York: Harcourt, Brace, Jovanovich.
    • Smith, D. A. 1982. Song of the Drill and Hammer: The Colorado San Juans, 1860–1914. Colorado School of Mines Press.

  10. 10. Fitting Models to Data

    What Is Fitting?

    For the moment, let us assume that we have a validated, precisely known data set. However, we don’t know the relationships within the data, its trends or driving tendencies. So we decide to develop a model to gain a deeper understanding. Generating a model is easy. Personal income = (5 * personal height) + 6. There. Done! Yet to what extent is it a valid model? There are tools for that. In fact, modeling is often a process of adjusting the model function and parameters until the model fits the data well. Fitting models to data typically involves adjusting parameters, including initial conditions, so that values comprising the model are closer to data values. Sometimes the underlying function itself needs to be changed, as long as that does not violate the principle under which the model was derived.
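    The adjust-until-it-fits process can be illustrated with a brute-force search over a parameter grid, minimizing the sum of squared residuals; the data pairs below are invented for illustration:

```python
# (x, y) data pairs, invented for illustration
data = [(0, 1.1), (1, 2.9), (2, 5.2), (3, 6.8)]

def sse(m, b):
    """Sum of squared residuals between the model m*x + b and the data."""
    return sum((y - (m * x + b)) ** 2 for x, y in data)

# Brute-force grid search: try slopes and intercepts from 0.0 to 5.0
# in steps of 0.1, keeping the pair with the smallest total squared error.
best_m, best_b = min(
    ((m / 10, b / 10) for m in range(51) for b in range(51)),
    key=lambda p: sse(*p),
)
print(best_m, best_b)  # a line close to y = 2x + 1
```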

    Total Error Minimization Techniques

    Basic Notation

    The following notation will be used for the below methods.

    • \(Y_t\) value of a time series at period \(t\)
    • \(\widehat{Y}_t  = \)model value of \(Y_t\)
    • \(e_t = Y_t - \widehat{Y}_t\) = residual, or model error


    Mean Absolute Deviation

    Mean Absolute Deviation (MAD) involves averaging the absolute values of the errors. It is useful for expressing error in the same units as the underlying data. \(MAD = \frac{1}{n}\sum_{t=1}^{n} \lvert Y_t - \widehat{Y}_t \rvert \)

    Mean Squared Error

    The Mean Squared Error (MSE) technique minimizes the sum of the squared errors. For each data point, calculate the value the model would produce (e.g. at each point in time). Take the square of the difference between the calculated and actual values. Add up all of those squares. Adjust your unknown parameters to produce the smallest value for the sum of the squares. This yields your best-fit model. This technique heavily penalizes large errors, so it may not be practical with unreliable data. \(MSE = \frac{1}{n}\sum_{t=1}^{n} \big(Y_t - \widehat{Y}_t \big)^2 \)

    Mean Absolute Percentage Error

    Mean absolute percentage error (MAPE) is useful when expressing errors as a percentage is desired. This technique may be especially useful for efficiency-discounted exponential growth (EDEG) models, where there is often tremendous variation in the y value. \(MAPE = \frac{1}{n}\sum_{t=1}^{n} \frac{\lvert Y_t - \widehat{Y}_t \rvert}{Y_t} \)
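    The three error measures above can be computed with a short helper; the actual and model values here are illustrative:

```python
def error_metrics(actual, model):
    """Return (MAD, MSE, MAPE) for paired actual and model values."""
    n = len(actual)
    residuals = [a - m for a, m in zip(actual, model)]
    mad = sum(abs(e) for e in residuals) / n          # mean absolute deviation
    mse = sum(e ** 2 for e in residuals) / n          # mean squared error
    mape = sum(abs(e) / a for e, a in zip(residuals, actual)) / n
    return mad, mse, mape

mad, mse, mape = error_metrics([100, 200, 300], [110, 190, 320])
print(round(mad, 2), mse, round(mape, 4))  # 13.33 200.0 0.0722
```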

    Characteristic Minimization Goals

    Instead of reducing total error, you might wish to minimize the difference with respect to a particular characteristic. For example, a priority might be to match your model’s peak with that of the data. Or you may wish to match particular events in time.


    Presentation of quantitative fitting techniques adapted from John E. Hanke and Dean W. Wichern, Business Forecasting, Eighth Edition. Pearson Prentice Hall, 2005.

  11. 11. Exponential functions