Numeral system
2008/9 Schools Wikipedia Selection. Related subjects: Mathematics
A numeral system (or system of numeration) is a mathematical notation for representing numbers of a given set by symbols in a consistent manner. It can be seen as the context that allows the numeral "11" to be interpreted as the binary numeral for three, the decimal numeral for eleven, or other numbers in different bases.
Ideally, a numeral system will:
- Represent a useful set of numbers (e.g. all whole numbers, integers, or real numbers)
- Give every number represented a unique representation (or at least a standard representation)
- Reflect the algebraic and arithmetic structure of the numbers.
For example, the usual decimal representation of whole numbers gives every whole number a unique representation as a finite sequence of digits, with the operations of arithmetic (addition, subtraction, multiplication and division) being present as the standard algorithms of arithmetic. However, when decimal representation is used for the rational or real numbers, the representation is no longer unique: many rational numbers have two numerals, a standard one that terminates, such as 2.31, and another that recurs, such as 2.309999999... . Numerals which terminate have no non-zero digits after a given position. For example, numerals like 2.31 and 2.310 are taken to be the same, except in the experimental sciences, where greater precision is denoted by the trailing zero.
Numeral systems are sometimes called number systems, but that name is misleading, as it could refer to different systems of numbers, such as the system of real numbers, the system of complex numbers, the system of p-adic numbers, etc. Such systems are not the topic of this article.
Types of numeral systems
The most commonly used system of numerals is known as Hindu-Arabic numerals, and two great Indian mathematicians could be given credit for developing them. Aryabhatta of Kusumapura, who lived during the 5th century, developed place-value notation, and Brahmagupta, a century later, introduced the symbol for zero.
The simplest numeral system is the unary numeral system, in which every natural number is represented by a corresponding number of symbols. If the symbol / is chosen, for example, then the number seven would be represented by ///////. Tally marks represent one such system still in common use. In practice, the unary system is normally only useful for small numbers, although it plays an important role in theoretical computer science. Also, Elias gamma coding which is commonly used in data compression expresses arbitrary-sized numbers by using unary to indicate the length of a binary numeral.
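As a rough illustration of the two ideas in this paragraph, the following Python sketch writes a number in unary and then applies Elias gamma coding, where a run of zeros (a unary count) announces how many binary digits follow. The function names `unary` and `elias_gamma` are chosen here for illustration only.

```python
def unary(n: int, mark: str = "/") -> str:
    """Represent a natural number by repeating one symbol n times."""
    return mark * n


def elias_gamma(n: int) -> str:
    """Elias gamma code of a positive integer, as a string of bits.

    The binary numeral of n is preceded by (length - 1) zeros, so a
    decoder can count the zeros to learn how many bits to read next.
    """
    if n < 1:
        raise ValueError("Elias gamma coding is defined for positive integers")
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary


print(unary(7))        # ///////
print(elias_gamma(5))  # 00101
```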
The unary notation can be abbreviated by introducing different symbols for certain new values. Very commonly, these values are powers of 10; so for instance, if / stands for one, - for ten and + for 100, then the number 304 can be compactly represented as +++ //// and number 123 as + - - /// without any need for zero. This is called sign-value notation. The ancient Egyptian system is of this type, and the Roman system is a modification of this idea.
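The sign-value scheme in this paragraph can be sketched in a few lines of Python. The symbol table follows the example in the text (/ for one, - for ten, + for 100); the function name `sign_value` and the choice to separate groups with a space are assumptions made for the illustration.

```python
# Symbols from the example in the text, largest value first.
SYMBOLS = [(100, "+"), (10, "-"), (1, "/")]


def sign_value(n: int) -> str:
    """Write n by repeating each symbol as often as its value fits."""
    groups = []
    for value, symbol in SYMBOLS:
        count, n = divmod(n, value)
        if count:
            groups.append(symbol * count)
    return " ".join(groups)


print(sign_value(304))  # +++ ////
```

No symbol for zero is involved: the missing tens in 304 simply contribute no marks. With only three symbols, numbers of 1000 or more need ever longer runs of +, which is why such systems keep introducing new symbols for higher powers.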
More useful still are systems which employ special abbreviations for repetitions of symbols; for example, using the first nine letters of our alphabet for these abbreviations, with A standing for "one occurrence", B "two occurrences", and so on, we could then write C+ D/ for the number 304. The numeral system of English is of this type ("three hundred [and] four"), as are those of virtually all other spoken languages, regardless of what written systems they have adopted.
More elegant is a positional system, also known as place-value notation. Again working in base 10, we use ten different digits 0, ..., 9 and use the position of a digit to signify the power of ten that the digit is to be multiplied with, as in 304 = 3×100 + 0×10 + 4×1. Note that zero, which is not needed in the other systems, is of crucial importance here, in order to be able to "skip" a power. The Hindu-Arabic numeral system, borrowed from India, is a positional base 10 system; it is used today throughout the world.
Arithmetic is much easier in positional systems than in the earlier additive ones; furthermore, additive systems need a potentially infinite number of different symbols for the different powers of 10, whereas a positional system needs only 10 different symbols (assuming that it uses base 10).
The numerals used when writing numbers with digits or symbols can be divided into two types that might be called the arithmetic numerals 0,1,2,3,4,5,6,7,8,9 and the geometric numerals 1,10,100,1000,10000... respectively. The sign-value systems use only the geometric numerals and the positional system uses only the arithmetic numerals. The sign-value system does not need arithmetic numerals because they are made by repetition (except for the Ionic system), and the positional system does not need geometric numerals because they are made by position. However, the spoken language uses both arithmetic and geometric numerals.
In certain areas of computer science, a modified base-k positional system is used, called bijective numeration, with digits 1, 2, ..., k (k ≥ 1), and zero being represented by the empty string. This establishes a bijection between the set of all such digit-strings and the set of non-negative integers, avoiding the non-uniqueness caused by leading zeros. Bijective base-k numeration is also called k-adic notation, not to be confused with p-adic numbers. Bijective base-1 is the same as unary.
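A minimal Python sketch of bijective base-k numeration, assuming digits are returned as a list of integers in the range 1..k with the most significant digit first. The name `to_bijective` is illustrative and not taken from any library.

```python
def to_bijective(n: int, k: int) -> list:
    """Digits (each in 1..k) of the non-negative integer n in bijective base k."""
    if n < 0 or k < 1:
        raise ValueError("need n >= 0 and k >= 1")
    digits = []
    while n > 0:
        # Shifting by one maps the remainder range 0..k-1 onto digits 1..k.
        n, r = divmod(n - 1, k)
        digits.append(r + 1)
    return digits[::-1]


print(to_bijective(0, 2))  # []  (zero is the empty string)
print(to_bijective(4, 2))  # [1, 2]
print(to_bijective(3, 1))  # [1, 1, 1]  (bijective base 1 is unary)
```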
Bases used
In computing
Switches, mimicked by their electronic successors built originally of vacuum tubes and in modern technology of transistors, have only two possible states: "open" and "closed". Substituting open=1 and closed=0 (or the other way around) yields the entire set of binary digits. This base-2 system (binary) is the basis for digital computers. It is used to perform integer arithmetic in almost all digital computers; some exotic base-3 (ternary) and base-10 computers have also been built, but those designs were discarded early in the history of computing hardware.
Modern computers use transistors that represent two states with either high or low voltages. The smallest unit of memory for this binary state is called a bit. Bits are arranged in groups to aid in processing, and to make the binary numbers shorter and more manageable for humans. More recently these groups of bits, such as bytes and words, are sized in multiples of four. Thus base 16 (hexadecimal) is commonly used as shorthand. Base 8 (octal) has also been used for this purpose.
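The use of hexadecimal as shorthand for binary can be sketched as follows: each group of four bits corresponds to exactly one base-16 digit. The helper name `bits_to_hex` is an assumption made for this example.

```python
HEX_DIGITS = "0123456789ABCDEF"


def bits_to_hex(bits: str) -> str:
    """Rewrite a string of binary digits as hexadecimal, four bits per digit."""
    # Pad on the left with zeros so the length is a multiple of four.
    width = -(-len(bits) // 4) * 4
    padded = bits.zfill(width)
    groups = [padded[i:i + 4] for i in range(0, len(padded), 4)]
    return "".join(HEX_DIGITS[int(group, 2)] for group in groups)


print(bits_to_hex("10110111"))  # B7
```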
A computer does not treat all of its data as numerical. For instance, some of it may be treated as program instructions or data such as text. However, arithmetic and Boolean logic constitute most internal operations. Whole numbers are represented exactly, as integers. Real numbers, allowing fractional values, are usually approximated as floating point numbers. The computer uses different methods to do arithmetic with these two kinds of numbers.
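The difference between exact integers and approximated real numbers can be observed directly. The sketch below assumes IEEE 754 double-precision floats, which is what Python uses on mainstream platforms; the variable names are illustrative.

```python
from fractions import Fraction

# Exact rational value of the float that is stored for the literal 0.1.
stored = Fraction(0.1)

# The stored value is a fraction with a power-of-two denominator,
# so it cannot be exactly one tenth.
is_exact_tenth = stored == Fraction(1, 10)
denominator = stored.denominator

integers_exact = (1 + 2 == 3)        # integer arithmetic is exact
floats_exact = (0.1 + 0.2 == 0.3)    # floating-point arithmetic is rounded

print(is_exact_tenth, integers_exact, floats_exact)
```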
Five
A base-5 system (quinary) has been used in many cultures for counting. Plainly it is based on the number of fingers on a human hand. It may also be regarded as a sub-base of other bases, such as base 10 and base 60.
Eight
A base-8 system (octal) was devised by the Yuki of Northern California, who used the spaces between the fingers to count. Zero to seven are the only possible digits. There is also linguistic evidence which suggests that the Bronze Age Proto-Indo-Europeans (from whom most European and Indic languages descend) might have replaced a base 8 system (or a system which could only count up to 8) with a base 10 system. The evidence is that the word for 9, newm, is suggested by some to derive from the word for 'new', newo-, suggesting that the number 9 had been recently invented and called the 'new number' (Mallory & Adams 1997).
Ten
The base-10 system (decimal) is the one most commonly used today. It is assumed to have originated because humans have ten fingers. These systems often use a larger superimposed base. See Decimal superbase.
Twelve
Base-12 systems (duodecimal or dozenal) have been popular because multiplication and division are easier than in base-10, with addition and subtraction being just as easy. 12 is a useful base because it has many factors. It is the smallest multiple of one through four and of six. There is still a special word for "dozen" and just like there is a word for 10^{2}, hundred, there is also a word for 12^{2}, gross. Base-12 could have originated from the number of knuckles in the four fingers of a hand excluding the thumb, which is used as a pointer in counting.
Twelve is a common British unit of measurement. There are twelve inches to a foot. Prior to 1971, in British currency, there were 12 pennies to a shilling. English words for numbers are also 'base-12' in that there is a unique word for the numbers one through twelve, with 'thirteen' being the first word that was formed by combining numbers (three and ten).
There are 24 hours per day, usually counted up to 12 until noon (a.m.) and once again until midnight (p.m.), often further divided into 6-hour periods in counting (for instance in Thailand) or marked by switches between terms like 'night', 'morning', 'afternoon', and 'evening', whereas other languages use such terms with durations of 3 to 9 hours, often according to switches at some of the 3-hour interval marks.
Multiples of 12 have been in common use as English units of resolution in the analog and digital printing world, where 1 point equals 1/72 of an inch and 12 points equal 1 pica, and printer resolutions like 360, 600, 720, 1200 or 1440 dpi (dots per inch) are common. These are combinations of base-12 and base-10 factors: (3×12)×10, 12×(5×10), (6×12)×10, 12×(10×10) and (12×12)×10.
Twenty
The Maya civilization and other civilizations of Pre-Columbian Mesoamerica used base-20 (vigesimal), possibly originating from the number of a person's fingers and toes. Evidence of base-20 counting systems is also found in the languages of central and western Africa.
Possible remnants of a base-20 system also exist in French, as seen in the names of the numbers from 60 through 99. For example, sixty-five is soixante-cinq (literally, "sixty [and] five"), while seventy-five is soixante-quinze (literally, "sixty [and] fifteen"). Furthermore, for any number between 80 and 99, the "tens-column" number is expressed as a multiple of twenty (somewhat similar to the archaic English manner of speaking of "scores"). For example, eighty-two is quatre-vingt-deux (literally, four twenty[s] [and] two), while ninety-two is quatre-vingt-douze (literally, four twenty[s] [and] twelve).
The Irish language also used base-20 in the past, twenty being fichid, forty dhá fhichid, sixty trí fhichid and eighty ceithre fhichid. A remnant of this system may be seen in the modern word for 40, daoichead.
Danish numerals display a similar base-20 structure.
Sixty
Base 60 (sexagesimal) was used by the Sumerians and their successors in Mesopotamia and survives today in our system of time (hence the division of an hour into 60 minutes and a minute into 60 seconds) and in our system of angular measure (a degree is divided into 60 minutes and a minute is divided into 60 seconds). 60 also has a large number of factors, including the first six counting numbers. Base-60 systems are believed to have originated through the merging of base-10 and base-12 systems. The Chinese calendar, for example, uses a base-60 Jia-Zi (甲子) system to denote years, with each year within the 60-year cycle being named with two symbols, the first being base-10 (called Tian-Gan 天干, or heavenly stems) and the second being base-12 (called Di-Zhi 地支, or earthly branches). Both symbols are incremented in successive years until the first pattern recurs 60 years later. The second symbol of this system is also related to the 12-animal Chinese zodiac system. The Jia-Zi system can also be applied to counting days, with a year containing roughly six 60-day cycles.
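The 60-year recurrence of the stem and branch pair can be checked with a short Python sketch. Stems and branches are represented here simply by their indices (0 to 9 and 0 to 11); the names `STEMS`, `BRANCHES` and `year_pair` are illustrative.

```python
from math import gcd

STEMS = 10      # heavenly stems, a cycle of ten
BRANCHES = 12   # earthly branches, a cycle of twelve


def year_pair(n: int) -> tuple:
    """Stem and branch indices for the n-th year, both advancing each year."""
    return (n % STEMS, n % BRANCHES)


# The combined cycle length is the least common multiple of 10 and 12.
cycle_length = STEMS * BRANCHES // gcd(STEMS, BRANCHES)

pairs = [year_pair(n) for n in range(cycle_length)]
print(cycle_length, len(set(pairs)))  # 60 60
```

Because both symbols advance together, only 60 of the 120 conceivable stem and branch combinations ever occur.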
Dual base (five and twenty)
Many ancient counting systems use 5 as a primary base, almost surely coming from the number of fingers on a person's hand. Often these systems are supplemented with a secondary base, sometimes ten, sometimes twenty. In some African languages the word for 5 is the same as "hand" or "fist" (Dyola language of Guinea-Bissau, Banda language of Central Africa). Counting continues by adding 1, 2, 3, or 4 to combinations of 5, until the secondary base is reached. In the case of twenty, this word often means "man complete". This system is referred to as quinquavigesimal. It is found in many languages of the Sudan region.
Base names
1 - unary
2 - binary
3 - ternary / trinary
4 - quaternary
5 - quinary / quinternary
6 - senary / heximal / hexary
7 - septenary / septuary
8 - octal / octonary / octonal / octimal
9 - nonary / novary / noval
10 - decimal / denary
11 - undecimal / undenary / unodecimal
12 - dozenal / duodecimal / duodenary
13 - tridecimal / tredecimal / triodecimal
14 - tetradecimal / quadrodecimal / quattuordecimal
15 - pentadecimal / quindecimal
16 - hexadecimal / sexadecimal / sedecimal
17 - septendecimal / heptadecimal
18 - octodecimal / decennoctal
19 - nonadecimal / novodecimal / decennoval
20 - vigesimal / bigesimal / bidecimal
21 - unovigesimal / unobigesimal
22 - duovigesimal
23 - triovigesimal
24 - quadrovigesimal / quadriovigesimal
26 - hexavigesimal / sexavigesimal
27 - heptovigesimal
28 - octovigesimal
29 - novovigesimal
30 - trigesimal / triogesimal
31 - unotrigesimal
(...repeat naming pattern...)
36 - hexatridecimal / sexatrigesimal
(...repeat naming pattern...)
40 - quadragesimal / quadrigesimal
41 - unoquadragesimal
(...repeat naming pattern...)
50 - quinquagesimal / pentagesimal
51 - unoquinquagesimal
(...repeat naming pattern...)
60 - sexagesimal
(...repeat naming pattern...)
64 - quadrosexagesimal
(...repeat naming pattern...)
70 - septagesimal / heptagesimal
80 - octagesimal / octogesimal
90 - nonagesimal / novagesimal
100 - centimal / centesimal
(...repeat naming pattern...)
110 - decacentimal
111 - unodecacentimal
(...repeat naming pattern...)
200 - bicentimal / bicentesimal
(...repeat naming pattern...)
210 - decabicentimal
211 - unodecabicentimal
(...repeat naming pattern...)
300 - tercentimal / tricentesimal
400 - quattrocentimal / quadricentesimal
500 - quincentimal / pentacentesimal
600 - hexacentimal / hexacentesimal
700 - heptacentimal / heptacentesimal
800 - octacentimal / octocentimal / octacentesimal / octocentesimal
900 - novacentimal / novacentesimal
1000 - millesimal
2000 - bimillesimal
(...repeat naming pattern...)
10000 - decamillesimal
Positional systems in detail
In a positional base-b numeral system (with b a natural number greater than 1 known as the radix), b basic symbols (or digits) corresponding to the first b natural numbers including zero are used. To generate the rest of the numerals, the position of the symbol in the figure is used. The symbol in the last position has its own value, and as it moves to the left its value is multiplied by b.
For example, in the decimal system (base 10), the numeral 4327 means (4×10^{3}) + (3×10^{2}) + (2×10^{1}) + (7×10^{0}), noting that 10^{0} = 1.
In general, if b is the base, we write a number in the numeral system of base b by expressing it in the form a_{n}b^{n} + a_{n − 1}b^{n − 1} + a_{n − 2}b^{n − 2} + ... + a_{0}b^{0} and writing the enumerated digits a_{n}a_{n − 1}a_{n − 2} ... a_{0} in descending order. The digits are natural numbers between 0 and b − 1, inclusive.
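The polynomial form above can be evaluated digit by digit without computing the powers explicitly (Horner's rule). The following Python sketch assumes the digits are given as a list of integers, most significant first; the name `from_digits` is illustrative.

```python
def from_digits(digits, base: int) -> int:
    """Value of a_n ... a_1 a_0 in the given base, most significant digit first."""
    value = 0
    for digit in digits:
        if not 0 <= digit < base:
            raise ValueError(f"digit {digit} is not valid in base {base}")
        # Multiplying by the base shifts everything read so far one position left.
        value = value * base + digit
    return value


print(from_digits([4, 3, 2, 7], 10))  # 4327
print(from_digits([1, 1], 2))         # 3
```

The same digit list `[1, 1]` gives eleven in base 10 and three in base 2, which is the sense in which the base provides the context for interpreting a numeral.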
If a text (such as this one) discusses multiple bases, and if ambiguity exists, the base (itself represented in base 10) is added in subscript to the right of the number, like this: number_{base}. Unless specified by context, numbers without subscript are considered to be decimal.
By using a dot to divide the digits into two groups, one can also write fractions in the positional system. For example, the base-2 numeral 10.11 denotes 1×2^{1} + 0×2^{0} + 1×2^{−1} + 1×2^{−2} = 2.75.
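A small Python sketch of how a numeral with a fractional part might be evaluated, using exact rational arithmetic so that no rounding occurs. The function name `parse_positional` and the convention of using letters for digits above 9 are assumptions made for the example.

```python
from fractions import Fraction


def parse_positional(numeral: str, base: int) -> Fraction:
    """Exact value of a numeral such as '10.11' in the given base (2 to 36)."""
    integer_part, _, fraction_part = numeral.partition(".")
    value = Fraction(0)
    for ch in integer_part:
        digit = int(ch, 36)          # '0'-'9' then 'a'-'z' as digits 10..35
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value = value * base + digit
    weight = Fraction(1, base)       # weight of the first position after the dot
    for ch in fraction_part:
        digit = int(ch, 36)
        if digit >= base:
            raise ValueError(f"digit {ch!r} is not valid in base {base}")
        value += digit * weight
        weight /= base
    return value


print(parse_positional("10.11", 2))  # 11/4, i.e. 2.75
```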
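In general, numbers in the base b system are of the form:
(a_{n}a_{n − 1}...a_{1}a_{0}.c_{1}c_{2}c_{3}...)_{b} = \sum_{k=0}^{n} a_{k}b^{k} + \sum_{k=1}^{\infty} c_{k}b^{−k}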
The numbers b^{k} and b^{−k} are the weights of the corresponding digits. The position k is the logarithm of the corresponding weight w, that is k = log_{b}w = log_{b}b^{k}. The highest used position is close to the order of magnitude of the number.
The number of tally marks required in the unary numeral system for describing the weight would have been w. In the positional system the number of digits required to describe it is only k + 1 = log_{b}w + 1, for k ≥ 0. For example, describing the weight 1000 takes 4 digits, since log_{10}1000 + 1 = 3 + 1. The number of digits required to describe the position is log_{b}k + 1 = log_{b}log_{b}w + 1 (in positions 1, 10, 100... only for simplicity in the decimal example).
Position | 3 | 2 | 1 | 0 | -1 | -2 | ... |
---|---|---|---|---|---|---|---|
Weight | b^{3} | b^{2} | b^{1} | b^{0} | b^{−1} | b^{−2} | ... |
Digit | a_{3} | a_{2} | a_{1} | a_{0} | c_{1} | c_{2} | ... |
Decimal example weight | 1000 | 100 | 10 | 1 | 0.1 | 0.01 | ... |
Decimal example digit | 4 | 3 | 2 | 7 | 0 | 0 | ... |
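The contrast between unary and positional length can be illustrated with a short Python sketch. The helper name `digit_count` is chosen for the example; it counts digits by repeated division, which agrees with log_{b}w + 1 when w is an exact power of b.

```python
def digit_count(n: int, base: int) -> int:
    """Number of digits needed to write the positive integer n in the given base."""
    if n < 1 or base < 2:
        raise ValueError("need n >= 1 and base >= 2")
    count = 0
    while n > 0:
        n //= base       # dropping the last digit
        count += 1
    return count


weight = 1000
print(weight)                   # tally marks needed in unary: 1000
print(digit_count(weight, 10))  # digits needed in base 10: 4
```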
Note that a number has a terminating or repeating expansion if and only if it is rational; this does not depend on the base. A number that terminates in one base may repeat in another (thus 0.3_{10} = 0.0100110011001..._{2}). An irrational number stays aperiodic (an infinite number of non-repeating digits) in all integral bases. Thus, for example in base 2, π = 3.1415926..._{10} can be written down as the aperiodic 11.001001000011111..._{2}.
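Whether a rational number terminates in a given base can be decided from its denominator: the expansion terminates exactly when every prime factor of the reduced denominator also divides the base. A minimal Python sketch of that test, with the illustrative name `terminates`:

```python
from fractions import Fraction
from math import gcd


def terminates(q: Fraction, base: int) -> bool:
    """True if q has a terminating expansion in the given base."""
    d = q.denominator            # Fraction keeps q in lowest terms
    g = gcd(d, base)
    while g > 1:
        d //= g                  # strip factors shared with the base
        g = gcd(d, base)
    return d == 1


print(terminates(Fraction(3, 10), 10))  # True:  0.3
print(terminates(Fraction(3, 10), 2))   # False: 0.0100110011...
```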
If b = p is a prime number, one can define base-p numerals whose expansion to the left never stops; these are called the p-adic numbers.
Change of radix
A simple algorithm for converting integers between positive-integer radices is repeated division by the target radix; the remainders give the "digits" starting at the least significant. E.g., 1020304 base 10 into base 7:
1020304 / 7 = 145757 r 5
145757 / 7 = 20822 r 3
20822 / 7 = 2974 r 4
2974 / 7 = 424 r 6
424 / 7 = 60 r 4
60 / 7 = 8 r 4
8 / 7 = 1 r 1
1 / 7 = 0 r 1
=> 11446435
E.g., 10110111 base 2 into base 5:
10110111 / 101 = 100100 r 11 (3)
100100 / 101 = 111 r 1 (1)
111 / 101 = 1 r 10 (2)
1 / 101 = 0 r 1 (1)
=> 1213
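The repeated-division procedure can be sketched in Python as follows. The function name `to_base` is illustrative; digits are returned as a list of integers, most significant first.

```python
def to_base(n: int, base: int) -> list:
    """Digits of the non-negative integer n in the given base."""
    if n < 0 or base < 2:
        raise ValueError("need n >= 0 and base >= 2")
    if n == 0:
        return [0]
    digits = []
    while n > 0:
        n, remainder = divmod(n, base)
        digits.append(remainder)   # remainders arrive least significant first
    return digits[::-1]


print(to_base(1020304, 7))        # [1, 1, 4, 4, 6, 4, 3, 5]
print(to_base(0b10110111, 5))     # [1, 2, 1, 3]
```

Both results agree with the worked examples above. The integer is held in the machine's own representation, so the source base only matters when the number is first read in.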
To convert a "decimal" fraction, do repeated multiplication, taking the protruding integer parts as the "digits". Unfortunately a terminating fraction in one base may not terminate in another. E.g., 0.1A4C base 16 into base 9:
0.1A4C × 9 = 0.ECAC
0.ECAC × 9 = 8.520C
0.520C × 9 = 2.E26C
0.E26C × 9 = 7.F5CC
0.F5CC × 9 = 8.A42C
0.A42C × 9 = 5.C58C
=> 0.082785...
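The repeated-multiplication procedure for fractions can be sketched with exact rational arithmetic. The function name `fraction_digits` and the fixed digit count are assumptions of this example; since the expansion may never terminate, the caller chooses how many digits to produce.

```python
from fractions import Fraction


def fraction_digits(x: Fraction, base: int, count: int) -> list:
    """First `count` digits after the point of x (0 <= x < 1) in the given base."""
    if not 0 <= x < 1:
        raise ValueError("x must satisfy 0 <= x < 1")
    digits = []
    for _ in range(count):
        x *= base
        digit = x.numerator // x.denominator   # the protruding integer part
        digits.append(digit)
        x -= digit                             # keep only the fractional part
    return digits


# 0.1A4C in base 16 is 0x1A4C / 16**4.
x = Fraction(0x1A4C, 16**4)
print(fraction_digits(x, 9, 6))  # [0, 8, 2, 7, 8, 5]
```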
Generalized variable-length integers
More general is using a notation (here written little-endian) like a_{0}a_{1}a_{2} for a_{0} + a_{1}b_{1} + a_{2}b_{1}b_{2}, etc.
This is used in punycode, one aspect of which is the representation of a sequence of non-negative integers of arbitrary size in the form of a sequence without delimiters, of "digits" from a collection of 36: a-z and 0-9, representing 0-25 and 26-35 respectively. A digit lower than a threshold value marks that it is the most-significant digit, hence the end of the number. The threshold value depends on the position in the number. For example, if the threshold value for the first digit is b (i.e. 1) then a (i.e. 0) marks the end of the number (it has just one digit), so in numbers of more than one digit the range is only b-9 (1-35), therefore the weight b_{1} is 35 instead of 36. Suppose the threshold values for the second and third digit are c (2), then the third digit has a weight 34 × 35 = 1190 and we have the following sequence:
a (0), ba (1), ca (2), .., 9a (35), bb (36), cb (37), .., 9b (70), bca (71), .., 99a (1260), bcb (1261), etc.
Note that unlike a regular base-35 numeral system, we have numbers like 9b where 9 and b each represent 35; yet the representation is unique because ac and aca are not allowed.
The flexibility in choosing threshold values allows optimization depending on the frequency of occurrence of numbers of various sizes.
The case with all threshold values equal to 1 corresponds to bijective numeration, where the zeros correspond to separators of numbers with digits which are nonzero.
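The threshold scheme described in this section can be sketched in Python. This is only the simplified scheme from the example (threshold 1 for the first digit, 2 for every later digit), not the full Punycode algorithm; the names `ALPHABET`, `threshold`, `decode_one` and `decode_sequence` are hypothetical.

```python
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"   # a-z = 0-25, 0-9 = 26-35
BASE = len(ALPHABET)


def threshold(position: int) -> int:
    """Thresholds from the example: b (1) for the first digit, c (2) afterwards."""
    return 1 if position == 0 else 2


def decode_one(text: str, start: int = 0) -> tuple:
    """Decode one little-endian number; return (value, index after its last digit)."""
    value, weight = 0, 1
    for position, ch in enumerate(text[start:]):
        digit = ALPHABET.index(ch)
        value += digit * weight
        t = threshold(position)
        if digit < t:
            # A digit below the threshold is the most significant one.
            return value, start + position + 1
        # Only BASE - t digit values can occur in a non-final position.
        weight *= BASE - t
    raise ValueError("input ended before a terminating digit")


def decode_sequence(text: str) -> list:
    """Decode a run of numbers written back to back without delimiters."""
    values, index = [], 0
    while index < len(text):
        value, index = decode_one(text, index)
        values.append(value)
    return values


print(decode_one("bcb")[0])       # 1261
print(decode_sequence("9abcb"))   # [35, 1261]
```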
Properties of numerical systems with integer bases
Numeral systems with base A, where A is a positive integer, possess the following properties:
- If A is even and A/2 is odd, all integral powers greater than zero of the number (A/2)+1 will contain (A/2)+1 as their last digit
- If both A and A/2 are even, then all integral powers greater than or equal to zero of the number (A/2)+1 will alternate between having (A/2)+1 and 1 as their last digit. (For odd powers it will be (A/2)+1, for even powers it will be 1)
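Proof of the first property:
Define x = A/2 + 1. Then x is even, and all x^{p} for p greater than 0 must be even. The property is equivalent to
x^{p} ≡ x (mod A)
We first check the case for p=1:
x^{1} ≡ x (mod A)
x is less than A, so the result is trivial. We then check for p=2:
x^{2} = x(x − 1) + x
Since x − 1 = A/2, then for all even N:
N(x − 1) = (N/2)A ≡ 0 (mod A)   (Equation 1)
Because x is even, then x(x − 1) is congruent to zero modulo A. Therefore:
x^{2} ≡ x (mod A)
Using induction, assuming that the property holds for p-1:
x^{p} = x^{p − 1}(x − 1) + x^{p − 1}
Since the case holds for p-1, then x^{p − 1} ≡ x (mod A). Since x^{p − 1}(x − 1) is a case of Equation 1, then x^{p − 1}(x − 1) ≡ 0 (mod A). This leaves, for all p greater than 0,
x^{p} ≡ x (mod A)
Q.E.D.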
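Proof of the second property:
Define x = A/2 + 1. Then x is odd, and all x^{p} for p greater than or equal to 0 must be odd. The property is equivalent to
x^{p} ≡ x (mod A) for odd p, and x^{p} ≡ 1 (mod A) for even p
Since x − 1 = A/2, then for all odd E:
E(x − 1) = ((E − 1)/2)A + A/2 ≡ x − 1 (mod A)   (Equation 2)
The case is first checked for p=0:
x^{0} = 1 ≡ 1 (mod A)
This result is trivial.
Next, for p=1:
x^{1} = x ≡ x (mod A)
This result is also trivial.
Next, for p=2:
x^{2} = x(x − 1) + x
Because x is odd, then x(x − 1) is a case of Equation 2, so
x^{2} ≡ (x − 1) + x = 2x − 1 = A + 1 ≡ 1 (mod A)
Next, for p=3:
x^{3} = x^{2}(x − 1) + x^{2}
Because x^{2} is odd, x^{2}(x − 1) is a case of Equation 2, so x^{2}(x − 1) ≡ x − 1 (mod A).
Since x^{2} ≡ 1 (mod A),
x^{3} ≡ (x − 1) + 1 (mod A), so x^{3} ≡ x (mod A).
Using induction, assuming that the property holds for p-1:
x^{p} = x^{p − 1}(x − 1) + x^{p − 1}
If p is odd:
Since x^{p − 1}(x − 1) is a case of Equation (2), x^{p − 1}(x − 1) ≡ x − 1 (mod A). Since p − 1 is even, x^{p − 1} ≡ 1 (mod A), so
x^{p} ≡ (x − 1) + 1 = x (mod A)
If p is even:
Since x^{p − 1}(x − 1) is a case of Equation (2), x^{p − 1}(x − 1) ≡ x − 1 (mod A).
Since p − 1 is odd, x^{p − 1} ≡ x (mod A), so
x^{p} ≡ (x − 1) + x = 2x − 1 = A + 1 ≡ 1 (mod A)
Q.E.D.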
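Both properties can be spot-checked numerically. The Python sketch below is not a proof; it only tests a finite range of bases and exponents, and the function names and the bound `max_power` are chosen for the illustration.

```python
def first_property_holds(A: int, max_power: int = 20) -> bool:
    """A even with A/2 odd: every positive power of A/2 + 1 ends in A/2 + 1."""
    x = A // 2 + 1
    return all(pow(x, p, A) == x % A for p in range(1, max_power + 1))


def second_property_holds(A: int, max_power: int = 20) -> bool:
    """A and A/2 both even: powers of A/2 + 1 end in A/2 + 1 (odd p) or 1 (even p)."""
    x = A // 2 + 1
    return all(
        pow(x, p, A) == (x % A if p % 2 else 1)
        for p in range(max_power + 1)
    )


# Base 10: A/2 = 5 is odd, and every power of 6 ends in 6.
print(first_property_holds(10))    # True
# Base 12: A/2 = 6 is even, and powers of 7 end in 7, 1, 7, 1, ...
print(second_property_holds(12))   # True
```

The last digit of a number in base A is its remainder modulo A, which is why the check uses the three-argument form of `pow`.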