To Previous Chapter To Table of Contents To Bottom of Page To Next Chapter

CS421 - Advanced Data Structures and Algorithm Development - Chapter 6: NUMBERS

Arithmetic operations are not cheap if the numbers are "large". [need to measure an input's "size" by the number of digits needed to represent it {# of decimal digits vs. # of binary digits - bit-cost model}
How do we define the worst cost (and lower bounds) of a problem when the cost depends on divisibility properties of n? [e.g. how many times does two divide n?]

  1. Exponential Numbers [what is the best way to evaluate mn through a series of multiplications]
    • mn = 1 if n = 0; = m2[n/2] x mn mod 2 if n > 0
    • Algorithm 6.1 for computing basepower where power >= 0 is an integer
      POWER(base, power)
      if power = 0 then return 1
      else
          half = POWER(base, {power/2])
          half = half * half
          if power is odd then
              half = half * base
          return half
    • POWER finds mn in O(lg n) multiplications
      f(n) = 0 if n = 1; = f([n/2] + 1 + n mod 2 if n > 1 where
      f(n) = [lg n] + b(n) - 1 where b(n) is the number of ones in the binary representation of n and is no more than [lg n] + 1 => finding powers can be solved in a linear number of multiplications (linear in [(lg n) +1])
    • PAUSE: Why isn't f a function of n and m
    • So what's the smallest number of multiplications necessary to find the nth power? POWER finds m15 in six multiplications.
    • FACTOR ALGORITHM - factor n then look at all possible ways to comlete a multiplication from the combination of factors
    • Build addition chains [sequence of exponents starting with 1 such that every exponent in the sequence is the sum of two previous exponents in the sequence]
    • binary algorithm (POWER) is optimal for n where b(n) <= 3, but not where b(n) = 4.
    • factor algorithm is not optimal for n = 33
  2. Common Numbers
    • The largest common factor of two positive integers is the largest integer that divides both numbers exactly
    • unique factors theorem - every integer greater than 1 is uniquely expressible as a product of primes raised to integer powers
      [n = 2n13n25n37n4...]
    • largest common factor of n and m is 2min(n1, m1)3min(n2, m2)5min(n3, m3)...
    • PAUSE: How hard is factoring? Is there a cheaper way?
    • f(n,m) = n if m=0;
             m if n=0;
             1 if n or m = 1
             f(n-m, m) if n>= m>1
             f(n, m-n) if m>= n>1
    • quicker to divide: for all n,m there exists k,l such that n = km + l where m > l >= 0 [k is the quotient and l is the remainder of the division] : n = [n/m]m + n mod m
      f(n, m) = f(m, l) = f(m, n mod m)
    • 		LARGEST COMMON FACTOR(n, m)
      {find the largest common factor of n and m,	n >= m >=  0. }
      		if m = 0 then 
      			return n 
      		else 
      			return LARGEST COMMON FACTOR(m, n mod m) 
      		
    • How long does this algorithm take? [if n >= m, how big is n mod m relative to n?] {Show that n/2 > n mod m} => stop after 2 lg n iterations
    • Worst case is when m and n are consecutive Fibonacci numbers and the worst number of iterations is [ logf Ö5n] -2 wheref is 1.618... is the golden ratio.
  3. Prime Numbers
    • Sieve of Eratosthenes: divide n by all numbers up to [Ön]
    • PAUSE: Assume that it takes k microseconds to divide two k-digit numbers. How long will the sieve take to find the 1,065 digit prime in fig. 6.5? - sieve uses up to [Ön] - 1 divisions, so in the bit-cost model, worst cost is exponential in length of the input ([lg(n+1)]
    • Find the number of times two divides n
    • If n is in trinary, powers of three factors are easy to find. How hard are they to find if n is in binary?
    1. Number Theory
      if n divides a - b then a º b (mod n) [i.e. a and b have the same remainder when divided by n]
      if n is an odd prime and m and a are less than n, then a is a modular square root of m modulo n if
      m º a2 (mod n) - whenever such an a exists, m is a modular square modulo n. [Every modular square has at least two modular square roots.]
      For every odd prime n, exactly half the numbers from 1 to n - 1 are modular squares modulo n.

      If m º ab (mod n) then b is the base a modular logarithm of m modulo n.
      MODULAR_POWER (base, power, modulus)
      	{Compute basepower mod modulus.
      		power >= 0 is an integer; modulus >= 1 is an integer}
      	
      	if power > 0 then
      		return 1
      	else
      		half = MODULAR_POWER(base, [power/2], modulus)
      		half = half * half
      		if power is odd then
      			half = half * base
      		return half mod modulus
      
      Two numbers are coprime if their only common factor is 1.
      j(n) = number of positive integers less than, and coprime to, n [totient function].
      Euler-Fermat theorem = for every coprime n and m,
      mj(n) º 1 (mod n)
      If n is prime, then for all m ¹ n
      mn-1 º 1 (mod n) [Fermat's Theorem] [i.e. if n is prime then for all smaller m, mn-1 is one more than a multiple of n.]

      Can we use Fermat's theorem to test for primes? Carmichael number n is composite, yet for every m < n, mn-1 is one more than a multiple of n. [561=3x11x17]

      The Chinese remainder theorem states that if n1, n2, ... , nk are all pairwise coprime, then the k linear congruences
      m º a1 (mod n1), m º a2 (mod n2), ... , m º ak (mod nk) are simultaneously solvable and solution is unique modulo n1 x n2 x...x nk.

      The prime number theorem states that the number of primes less than n grows like n / lg n


    2. Finding Primes Probablistically
      • How do we know that 267 - 1 = 193707721 x 761838257287?
      • Prime # thm. tells us that primes are distributed as 1 / ln n
      • Can we pick a random # m between 2 and [Ön] and try dividing into n?
      • Should we pick an m <: n at random and test whether m is coprime?
      • Generate many random numbers m <: n and test whether mn-1 is 1 mod n.
      • Given n and m, let k = (n -1)/2n(n-1), where n(n) is the number of times two divides n. Then n is a base m psedudoprime if
        mk º 1 (mod n) or if there is an i between 0 and n(n-1) - 1 such that
        mk2i º -1 (mod n)
        ex. let n =37, and m =3, 7
        ex. let n =33
        
        PSEUDOPRIME( n)
        	{ Look for a witness to n's compositeness.
        		n(n) is the number of times two divides n.
        		Return composite	if n is found to be composite,
        		otherwise return pseudoprime.
        		n >= 3 is odd. }
        		
        	guess = uniform(2, n - 1)
        	if MODULAR_POWER(guess, n(n-1), n) = 1
        		return pseudoprime
        	
        	test = (n-1)/2n(n-1)
        	for i = 0 to n(n-1) -1
        		power = test * 2i
        		if n divides MODULAR_POWER(guess, power, n) + 1
        			return pseudoprime
        	return composite
        
        
  4. Secret Numbers
    How do we transport information in a network whose nodes are information producers and consumers and whose edges are information transporters?
    PAUSE: When in a bank do you trust someone claiming to be a teller? When logging on do you trust the program claiming to be the login program?
    authentication problem - each node distrusts each other
    traditional secrecy problem - three nodes where two nodes trust each other and both distructs the third (channel)
    cryptology - systems that keep secrets
    cryptography - making secure systems
    cryptoanalysis - breaking secure systems

    Data Encryption - uses an algorithm which hides the meaning of the text
    plaintext _______ ciphertext ________ original
    ---------------> |_______| -------------->|________| ----------->
    plaintext

    Why do we need encryption? ---- TRUST

    A good cryptographic algorithm:

    1. Should be simple to use by authorized users
    2. Should be difficult and time consuming for non-authorized users to decrypt.
    3. The security of the data should not depend on the secrecy of the algorithm.
    4. The efficiency and security of the algorithm should not be data dependent.
    Choose a function fc to encrypt and a function fd to decrypt where c and d are the keys [we want functions such that for all m in the message space: fd(fc(m)) = m

    Encryption Algorithms:

    1. substitution
      • Caesar Cipher - monoalphabetic translation
      • Vernam Cipher - uses one-time pad developed using random numbers
    2. transpositions (permutations) - used by Spartans
      • Columnar Transpositions
      • Double Transpositons
    3. One-time pads
    4. Most systems are only computationally secure
    • Secret Key Systems (need secret but common keys)
      • The unicity of a system is the average length of a ciphertext such that it is the image of only oone plaintext/key pair.
      • monoalphabetic - cryptosystem that always substitutes the same ciphertext for each plaintext letter [ any ciphertext longer than about 30 letters in a monoalphabetic system can be at most one message] - easy to break
      • Data Encyption Standard (DES) - uses 16 cycles of substitution and transposition of 64 bit blocks (uses a 56-bit key) [can be broken by brute force) [contains a trapdoor] fast
    • Public Key Systems - uses one-way functions, such that
      • fd(fc(m)) = m,
      • fc, fd are both easy to compute,
      • knowing c does not help to find d.

      some one-way functions include: computing factors, modular square roots, and modular logarithms; computationally hard to determine the private key from the public key
      the composition of two functions f and g is the function f g where for all x,
      f(g(x)) = y iff there exists a z such that g(x) = z and f(z) = y
      two functions f and g commute if for all x in their common domain
      f(g(x)) = g(f(x)) [fig. 6.9]
      Diffie-Hellman key exchange: Alice & Bob each agree on two large integers n and m, where n > m. Alice chooses a large integer a, computes ma mod n and sends this to Bob. Bob chooses a large integer b, computes mb mod n, and sends it to Alice. Alice computes (mb mod n)a mod n. and Bob computes (ma mod n)b mod n. These are equal and become the common secret key. Knowing n, m, ma mod n, mb mod n does not yield a or b.
    • A Factoring System - RSA (Rivest-Shamir-Adleman) = public key system, slower than DES
      Alice chooses two large non-equal primes, k and l and computes n = kl. Then she finds a large integer a, coprime to j(n), the totient of n, and computers b, the inverse of a mod j(n). [ab º 1 (mod j(n))] therefore, there is some i such that ab = ij(n) + 1. Alice's public key is (n, b); her private key is a.
      To send Alice a secret message, Bob converts his message to a sequence of numbers, each less than n. Let one of these numbers be m, compute mb mod m and transmit the ciphertest (c) to Alice. Alice computes ca mod n and recovers m since
      ca = (mb(mod n))a
      = mab mod n
      = mij(n) + 1 mod n
      = mij(n)*m mod n
      º m
    • A Knapsack System: Given n tools each weighing an integer number of kilograms, and a knapsack that can carry m kilograms, is there a subset of tools that will exactly fill the knapsack? [NP-complete problem]
      Merkle-Hellman: Alice publishes a vector of integers, A. To send a message m to Alice, convert m to a binary vector, B, and send Alice A . B = a1b1 + a2b2 + ... + anbn [inner product of A and B]
      Letting A be a superincreasing sequence allows Alice to easily determine B [can be easily broken].
      Mask an A' by transforming it to A: ai º l a'i mod k where l and k are large coprime integers
  5. Authentic Numbers
    authentication problem - want to verify that a message came from the alleged sender, went to the intended receiver, was not changed, and was received in the same sequence that it was sent (maintaining integrity)
      Attacks warranting authentication
    1. Disclosure - release of message contents to any person / process not possessing key
    2. Traffic Analysis - Discovery of the pattern of traffic between parties
    3. Masquerade - insertion of messages into the network from a fradulent source
    4. Content modification - changes to the contents of a message
    5. Sequence modification
    6. Timing modification - delay or replay of messages
    7. Repudiation - denial of receipt of message by destination
    • two-way trust: against an attacker
      send fc("fd(m) bob") where fd is Bob's private key and fc is Alice's public key and m is the message from Bob to Alice
    • one-way trust: user and computer, databse, bank, etc.
      cash dispenser problem - requires time-stamping message acknowledgement and sending two time-dependendent authentications with each message
    • no trust: between governments, banks, companies, etc.
      Authentication Functions
    • Message encryption - ciphertext of the entire message serves as its authenticator
    • Message authentication code (MAC) - public function of the message and a secret key that produces a fixed-length value that serves as the authenticator (cryptographic checksum - one-way function)
      COMPUTER DATA AUTHENTICATION - uses DES
    • Hash function - a public function that maps a message of any length into a fixed-length hash value, which serves as the authenticator (one-way function)
      ex. bit-by-bit exclusinve-OR of every block (longitudinal redunancy check)
      ex. MD5 (message digest) - creates a 128-bit message digest using 512-bit blocks of input
    • Public Authentication
      remote cash dispenser problem: Customer of bank 1 wants to use an ATM machine of bank 2's; if message authenticator is formed with bank 2's key, it must be replaced with a key known to both banks or part of the message must be in plaintext
      anonymous credit problem instance of privacy problem - Alice uses a smartcard to prove she is a valid customer and has enough money
      treaty compliance problem - installing monitors
    • Covert Channels - send secret messages hidden in regular message [use parity bits or checksums to indicate which message is authentic]
      detecting a covert channel can be made as hard as factoring, i.e. Let n = k*l*m where k, l, m are large primes and
      k º l º 3 (mod 4), m º 5 (mod 8)
      Any modular square a2 mod n, where a2 is coprime to n has eight modular square roots (grouped into 4 pairs). Alice and Bob agree on n, and know its factors and can then quickly compute the modular square roots of a transmitted a2. A message can be sent as a2 with one of its modular square roots
    • Zero-Knowledge Proofs - want to prove something without giving away any details of the proof.
      possible to probablistically prove that a theorem was provable without giving the proof. [claim to be able to solve any cubic equation and offers as proof any number of verifications] Requires one-way functions.
    • Digital Signatures
      chromatic number of a graph is the smallest number of colors needed to color its nodes so that no two neighboring nodes have the same color.
      • generate a random 3-colorable graph [grow a random graph that is always 3-colorable] with as many nodes as necessary to ensure security
      • give the authenticator the graph and an encrypted form of its coloring
      • authenticator displays graph (and we secretly color the graph and cover up the node colors)
      • Let authenticator pick a pair of nodes and tell the authenticator the colors of the chosen nodes
      • Give the authenticator the decrypt key for verification and randomly permute the colors of all nodes and repeat
  6. Random Numbers
      Which sequences are random?
    • 1,1,1,1,1,1,...
    • ,1,2,3,4,5,6,7,8,9,...
    • 2,7,1,8,2,8,1,8,2,...
    • 2,3,0,2,5,8,5,0,9,... [ln 10]
    • 0,5,8,8,2,3,5,2,9,... [1/17]
    • 3,3,5,4,4,3,5,5,4,... [# of letters in English names of numbers]
    • 0.1234567891011121314... [Champernowne's number]
    • uniform distribution - distribution of numbers in the sequence should be uniform.
    • independence - no value in the sequence can be inferred from the others [cannnot "prove" independence - use statistically random]

    • Sources of Random Numbers - physical noise generators, tables of published random numbers, algorithmic techniques
    • Pseudorandom Number Generators -
      linear congruential method: Xn+1 = (aXn + c) mod m where m is a large prime, and a, c, X0 are less than m
      let a = 75, c = 0, and m = 231 - 1 and use a "clock" to select X0
    • Cryptographically Generated Random Numbers
      • Cyclic Encryption - generates session keys from master key
      • DES Output Feedback Mode
      • ANSI X9.17 Pseudorandom Number Generator (used by PGP)
    • Blum Blum Shub Generator
      Choose two large prime numbers, p and q, such that p º q º 3 (mod 4); let n = p* q and choose a random number s that is relatively prime to n, Then the BBS generator is:
      X0 = s2 mod n
      for i = 1 to infinity
      	Xi = (xi-1)2 mod n
      	Bi = Xi mod 2
      
      The number B is comprised of the least significant bits from each iteration
  7. Coda--Straight On Till Morning

To Previous Chapter To Table of Contents To top of page To Next Chapter