TI Advanced Scientific Computer

The Advanced Scientific Computer, or ASC, was a supercomputer architecture designed by Texas Instruments (TI) between 1966 and 1973. Key to the ASC's design was a single high-speed shared memory, which was accessed by a number of processors and channel controllers, in a fashion similar to Seymour Cray's groundbreaking CDC 6600. Whereas the 6600 featured ten smaller computers feeding a single math unit (ALU), in the ASC this was simplified into a single 8-core processor feeding the ALU. The 4-core ALU/CPU was one of the first to include dedicated vector processing instructions, with the ability to send the same instruction to all four cores.

History

TI began as a division of Geophysical Service Incorporated (GSI), a company that performed seismic surveys for oil exploration companies. GSI had since become a subsidiary of TI, and TI wanted to apply the latest computer technology to the processing and analysis of seismic datasets. The ASC project started as the Advanced Seismic Computer. As the project developed, TI decided to expand its scope. "Seismic" was replaced by "Scientific" in the name, allowing the project to retain the designation ASC.

Originally the software, including an operating system and a FORTRAN compiler, was developed under contract by Computer Usage Company, under the direction of George R. Trimble, Jr.,[1][2] but it was later taken over by TI itself. Southern Methodist University in Dallas developed an ALGOL compiler for the ASC.

Architecture

Memory was accessed solely under the control of the memory control unit, or MCU. The MCU was a two-way, 256-bit-per-channel parallel network that could support up to eight independent processors, with a ninth channel for accessing "main memory" (or "extended memory", as TI referred to it). The MCU also acted as a cache controller, offering high-speed access on the eight processor ports to a semiconductor-based memory, and handling all communications to the 24-bit address space in main memory. The MCU was designed to operate asynchronously, allowing it to work at a variety of speeds and scale across a number of performance points. For instance, main memory could be constructed out of slower but less expensive core memory, although this was not done in practice. At its fastest, the MCU could sustain transfer rates of 80 million 32-bit words per second per port, for a total transfer capacity of 640 million words per second. This was well beyond the capabilities of even the fastest memories of the era.
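The figures quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers given in the text; the constant and function names are illustrative, and the text does not say whether the 24-bit address space counts words or bytes, so the sketch reports only a number of addressable locations.

```python
# Figures taken from the paragraph above; all names are illustrative only.
PROCESSOR_PORTS = 8                    # independent processor ports on the MCU
WORDS_PER_SEC_PER_PORT = 80_000_000    # peak 32-bit words per second per port
WORD_BITS = 32
CHANNEL_BITS = 256                     # width of one MCU channel
ADDRESS_BITS = 24                      # main-memory address width


def mcu_summary():
    """Derive aggregate figures from the per-port numbers in the text."""
    total_words = PROCESSOR_PORTS * WORDS_PER_SEC_PER_PORT
    return {
        "total_words_per_sec": total_words,
        "total_bytes_per_sec": total_words * WORD_BITS // 8,
        "words_per_channel_transfer": CHANNEL_BITS // WORD_BITS,
        # Assumption: the unit of addressing is not stated in the text.
        "addressable_locations": 2 ** ADDRESS_BITS,
    }


if __name__ == "__main__":
    for key, value in mcu_summary().items():
        print(f"{key}: {value:,}")
```

Eight ports at 80 million words per second each give the 640 million words per second quoted above, and a 256-bit channel carries eight 32-bit words per transfer.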

The main ALU/CPU was extremely advanced for its era. The design included four basic cores that could be combined to handle vector instructions. Each core included a complete instruction pipeline system that could keep up to twelve scalar instructions in flight at the same time, allowing up to 36 instructions in total across the entire CPU. From one to four vector results could be produced every 60 ns, the basic cycle time (about 16.7 MHz), depending on the number of execution units provided. Implementations of this sort of parallel/pipelined instruction system did not appear on modern commodity processors until the late 1990s, and vector instructions (now known as SIMD) until a few years later.
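The relationship between cycle time, clock frequency and peak vector throughput can be sketched as follows. This is a back-of-the-envelope calculation, assuming one result per execution unit per cycle as the paragraph describes; the function names are hypothetical.

```python
CYCLE_TIME_NS = 60  # basic cycle time quoted in the text


def clock_mhz(cycle_time_ns):
    """Clock frequency in MHz for a given cycle time in nanoseconds."""
    return 1_000 / cycle_time_ns


def peak_vector_results_per_sec(pipes, cycle_time_ns=CYCLE_TIME_NS):
    """Peak results per second, assuming one result per pipe per cycle."""
    return pipes * 1e9 / cycle_time_ns


if __name__ == "__main__":
    print(f"clock: {clock_mhz(CYCLE_TIME_NS):.2f} MHz")
    for pipes in (1, 2, 4):
        rate = peak_vector_results_per_sec(pipes)
        print(f"{pipes} pipe(s): {rate / 1e6:.1f} million results/s")
```

A 60 ns cycle corresponds to a clock of roughly 16.7 MHz, so the peak rate ranges from about 16.7 million results per second with one execution unit to about 66.7 million with four.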

The processor included 48 32-bit registers, a huge number for the time, although they were not general purpose as they are in modern designs. Sixteen were used for addresses, another sixteen for math, eight for index offsets and another eight for vector instructions. Registers were accessed externally using a RISC-like load/store system, with instructions to load anything from 4 bits to 64 bits (two registers) at a time.
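The register partition described above can be written down as a small table and checked against the stated total. This is only an illustrative sketch; the bank names are labels chosen here, not TI's terminology.

```python
REGISTER_BITS = 32

# Bank sizes as given in the text; the dictionary keys are illustrative labels.
REGISTER_BANKS = {
    "address": 16,
    "arithmetic": 16,
    "index": 8,
    "vector": 8,
}


def total_registers(banks=REGISTER_BANKS):
    """Total number of registers across all banks."""
    return sum(banks.values())


def registers_spanned(load_bits, register_bits=REGISTER_BITS):
    """Number of registers touched by a load of the given width."""
    return -(-load_bits // register_bits)  # ceiling division


if __name__ == "__main__":
    print("total registers:", total_registers())
    for width in (4, 32, 64):
        print(f"{width}-bit load spans {registers_spanned(width)} register(s)")
```

The four banks sum to the 48 registers quoted, and the widest load of 64 bits fills two 32-bit registers.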

Most vector machines tended to be memory-limited; that is, they could process data faster than they could get it from memory. This remains a major problem on modern SIMD designs as well, which is why considerable effort has been put into increasing memory throughput in modern computer designs (although largely unsuccessfully). In the ASC this was improved somewhat with a lookahead unit that predicted upcoming memory accesses and loaded them into the ALU registers invisibly, using a memory interface in the CPU known as the memory buffer unit (MBU).

The "Peripheral Processor" (PP) was a separate system dedicated entirely to quickly running the operating system and programs running within it, as well as feeding data to the main CPU. The PP was built out of eight "virtual processors" (VPs), which were designed to handle instructions and basic integer math only. Each VP included its own program counter and registers, and the system could thus run eight programs at the same time, limited by memory accesses. Keeping eight programs running allowed the system to shuffle execution of programs on the main CPU depending on what data was available on the memory bus at that time, attempting to avoid "dead time" when the CPU was waiting on memory. This technique has also made its appearance in modern CPUs, where it is known as simultaneous multithreading or, according to Intel, Hyper-Threading.
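The idea of several virtual processors, each with its own program counter and registers, taking turns on shared hardware can be illustrated with a toy simulation. This is a sketch under loud assumptions: the strict round-robin slot order, the instruction labels and all class and function names are hypothetical, since the text does not describe how the PP actually allocated time among the VPs.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualProcessor:
    vp_id: int
    program: list                 # hypothetical instruction stream (labels only)
    pc: int = 0                   # each VP keeps its own program counter
    registers: dict = field(default_factory=dict)  # and its own register state

    def done(self):
        return self.pc >= len(self.program)

    def step(self):
        """Issue the next instruction of this VP and advance its counter."""
        instr = self.program[self.pc]
        self.pc += 1
        return instr


def run_round_robin(vps):
    """Interleave the VPs one instruction per time slot; return the trace.

    Assumption: strict round-robin order, chosen only for illustration.
    """
    trace = []
    while any(not vp.done() for vp in vps):
        for vp in vps:
            if not vp.done():
                trace.append((vp.vp_id, vp.step()))
    return trace


if __name__ == "__main__":
    vps = [VirtualProcessor(i, [f"op{n}" for n in range(2)]) for i in range(8)]
    for vp_id, instr in run_round_robin(vps):
        print(f"VP{vp_id}: {instr}")
```

Because every VP carries its own state, switching between them costs nothing in this model: the eight programs simply advance in interleaved fashion.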

The PP also included a set of sixty-four 32-bit registers known as the communications register (CR). The CR put the "Peripheral" in the PP, and was the main storage system for state information between the various parts of the ASC: the CPU, VPs, and channel controllers.

The ASC instruction set included a "bit-reverse" instruction that was intended to speed up the calculation of fast Fourier transforms. By the time the ASC was in production, better FFT algorithms had been developed that did not require this operation. TI offered a bounty to the first person to come up with a valid use for the bit-reverse instruction. The bounty was never collected.
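For context, classic in-place radix-2 FFTs reorder their input or output so that the element at index i moves to the index obtained by reversing the bits of i. The following is a minimal software sketch of that reordering. It illustrates the general technique only, not the semantics of the ASC instruction, whose operand width and format are not given in the text; the function names are hypothetical.

```python
def bit_reverse(value, width):
    """Reverse the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)  # shift in the lowest bit
        value >>= 1
    return result


def bit_reversal_permutation(data):
    """Return `data` reordered by bit-reversed index (length must be 2**k)."""
    n = len(data)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    width = n.bit_length() - 1  # number of index bits
    return [data[bit_reverse(i, width)] for i in range(n)]


if __name__ == "__main__":
    print(bit_reversal_permutation(list(range(8))))
    # prints [0, 4, 2, 6, 1, 5, 3, 7]
```

In software the reversal costs a loop over the index bits for every element, which is the work a single hardware instruction would save.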

Technological success, business failure

When ASC machines first became available in the early 1970s they outperformed almost all other machines, including the CDC STAR-100, and under certain conditions matched the performance of the infamous one-off ILLIAC IV. However, only seven had been installed when the famous Cray-1 was announced in 1975. The Cray-1 dedicated almost all of its design to sustained high-speed access to memory, including over one million 64-bit words of semiconductor memory and a cycle time of 12.5 ns, roughly one-fifth that of the ASC. Although the ASC was in some ways a more expandable design, in the supercomputer world outright speed wins, and the Cray-1 was simply much faster. ASC sales ended almost overnight, and although an upgraded ASC had been designed with a cycle time one-fifth that of the original, Texas Instruments decided to exit the market entirely.

Vector processing applications

The ASC #1 prototype was a one-pipe system brought up in Austin, Texas, off site from TI's main plant for proprietary information reasons. It was later upgraded to two pipes and renamed ASC #1A. It was then used by TI's GSI division for seismic data processing. ASC #2 was leased to Shell Oil Company in Holland and also used for seismic data processing. ASC #3 was installed at the Redstone Arsenal in Huntsville, Alabama, for anti-ballistic missile interception technology development. With the SALT treaty, the system was later redeployed to the Army Corps of Engineers in Vicksburg, Mississippi, for dam stress analysis. ASC #4 was used by NOAA at Princeton University for developing weather forecasting models. ASC systems #5 and #6 were installed at TI's main plant in Austin and also used by GSI for seismic data processing. ASC #7 went to the Naval Research Lab in Maryland for plasma physics studies.

References

  • Peter M. Kogge (1981). The Architecture of Pipelined Computers. Taylor & Francis. pp. 159–162.
  • Larry A Rickert (1970-1983) ASC Prototype Technician, Field Support, & Program Controller