

Assembly Language Assignment Help | Assembly Language Programming Help | Assembly Language Online Tutor

If you need Assembly Program Assignment Help / Assembly Program Homework Help or Assembly Language Project Help, or are having problems with your Assembly Language Coursework, then we have the solutions you need. We can help you with any Assembly Language Problem you may have. Just upload your Assembly Language Assignment/Assembly Language Homework or Assembly Language Project at our website or email it to us. Mr. Neil Harding and his team of Assembly Language Tutors will go through your requirements and respond at the earliest. We provide quick Assembly Solutions.

You can schedule an Online Tutoring session with one of our Assembly Tutors by contacting us through our live chat window.

There are many processors in use and they are all different (which is one of the reasons that languages such as C, which made portable code easier, became popular). Assembly language is programming close to the metal, and exposes features that high level languages do not offer portably (since they vary between the different processors). There are 2 main types of processor: RISC (reduced instruction set computer) and CISC (complex instruction set computer). RISC processors have fewer instructions, but tend to execute them faster than CISC processors, which have a wider range of instructions.

When programming in machine code, most processors use registers (there are some stack based processors although these are not very common). They normally have a small number of registers that you can use, which are much faster to use than memory. The 6502 and the Z80, which both used to be very common, are 8 bit processors (although the Z80 has some operations that work on a pair of registers as a 16 bit value). In addition most processors have a status register which is set by the instructions. Typically it has flags such as zero (the last operation resulted in a zero value), negative, overflow and carry (these last 2 flags can be set when an addition results in a value too large to be stored in the register). The 8 bit processors normally had a 16 bit address bus, which meant they were limited to 64K of RAM, although at the time they often had less RAM than that. They would run at between 1 and 4 MHz. Other processors you may encounter in classes include the Motorola 68000, which was a 16/32 bit processor, and ARM, the 8086 family (80386, 80486, Pentium, Athlon, etc), SPARC and MIPS, which are 32 or 64 bit processors. You may also encounter the AVR microcontroller found on Arduino boards, which are used as a low cost platform for electronic projects. You use an assembler to convert assembly language mnemonics into a binary format which will run on the device or under an emulator. There are also cross assemblers, which generate code for a processor other than the one the assembler is running on (so to assemble ARM code on the PC you would use a cross assembler).

The 6502 is one of the simplest processors in use, as it only has 3 registers (A, X, Y). The A register is the accumulator and is used as a general purpose register. All arithmetic operations work on the accumulator. You can access memory indexed with the X and Y registers. It has a stack of 256 bytes, and a zero page (the first 256 bytes of memory) which can be accessed with a shorter instruction that takes less time. You use instructions such as LDA, which loads the accumulator with a value from memory, and STA, which stores the accumulator to memory. You have branch instructions such as BEQ (branch if the zero flag is set), BNE (branch if the zero flag is not set), BCC and BCS (which branch depending on the carry flag), and BMI and BPL (which branch based on the negative flag). The zero and negative flags can't be set directly, although you can use CLC and SEC to clear or set the carry flag. JSR is used to call a subroutine and RTS returns from it. TAX, TXA, TAY and TYA are used to transfer between the accumulator and the X and Y registers. There are instructions for adding (ADC), subtracting (SBC), shifting (ASL, LSR) and rotating (ROR, ROL), as well as the boolean instructions (AND, ORA, EOR; there is no NOT instruction, but EOR #$FF inverts the accumulator). There is also a decimal mode for BCD (binary coded decimal) arithmetic, which simplifies the logic for dealing with numbers you need to display (each 4 bits of a number is treated as a digit 0-9, with the carry automatically passing between the half bytes). This means you can extract each digit with a simple mask and shift, rather than having to divide by 10.
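To make the decimal mode idea concrete, here is a minimal Python sketch of adding two packed BCD bytes and then pulling the digits out with a mask and a shift. It only models the arithmetic, not the 6502 itself, and the function names bcd_add and bcd_digits are made up for this illustration.

```python
def bcd_add(a, b, carry_in=0):
    """Add two packed BCD bytes (two decimal digits each).

    Returns (result, carry_out). Hypothetical helper that models what an
    add in decimal mode does; it is not a cycle-accurate 6502 emulation.
    """
    low = (a & 0x0F) + (b & 0x0F) + carry_in
    half_carry = 0
    if low > 9:              # low digit overflowed past 9
        low -= 10
        half_carry = 1       # carry passes into the high half byte
    high = (a >> 4) + (b >> 4) + half_carry
    carry_out = 0
    if high > 9:             # high digit overflowed past 9
        high -= 10
        carry_out = 1        # would end up in the carry flag
    return (high << 4) | low, carry_out


def bcd_digits(value):
    """Split a packed BCD byte into (tens, units) with a shift and a mask."""
    return value >> 4, value & 0x0F
```

For example, bcd_add(0x19, 0x28) gives 0x47 with no carry, and bcd_digits(0x47) gives the digits 4 and 7 without any division.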
The Zilog Z80 processor was used in Sinclair's home computers (sold by Timex in America). It also ran CP/M, the operating system on which the original MS-DOS was modeled. I'm not aware of any courses that use the Z80 (if there are any, please let us know), but we can handle Z80 assignments as well.
The AVR microcontroller is normally programmed in C, but for higher performance or more accurate timing assembly language is useful. The Arduino is a reasonably low cost AVR-based board that offers an open source platform, with shields providing all types of input / output controls.
The 68000 family are 32 bit processors (although the 68000 only had a 16 bit data bus, and a 24 bit address bus). It has 16 registers: 8 data registers (D0-D7) and 8 address registers (A0-A7). You can access the data registers as either a byte, word, or long (8, 16 or 32 bit value) and they can be used orthogonally (you can use any data register when a data register is required). Operations on data registers set the condition code flags, whilst moving values into the address registers does not. You can access memory using the address registers, with post increment or pre decrement on the address. It is a big endian processor (the most significant byte comes first in memory), unlike the 8086 series. It was used in the Atari ST and Commodore Amiga computers as well as the Apple Macintosh, ran at around 8 MHz, and supported 16MB of RAM (although machines typically came with 512KB of RAM). It supports the basic mathematical operations, including multiplication and division (signed and unsigned).
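The difference between big endian and little endian storage is easiest to see by looking at the bytes of one value. The following Python sketch (the helper name bytes_in_memory is an assumption made for this example) shows how the same 32 bit value would be laid out in memory on a big endian processor such as the 68000 and on a little endian one such as the 8086.

```python
def bytes_in_memory(value, byteorder):
    """Return the 4 bytes of a 32 bit value in the order they sit in memory.

    byteorder is "big" (68000 style) or "little" (8086 style).
    """
    return list(value.to_bytes(4, byteorder))


value = 0x12345678
big = bytes_in_memory(value, "big")        # most significant byte first
little = bytes_in_memory(value, "little")  # least significant byte first
print([hex(b) for b in big])
print([hex(b) for b in little])
```

This matters whenever you read a multi byte value one byte at a time, or exchange binary data between the two kinds of processor.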
The ARM processor uses a RISC model, with all instructions being 32 bits in size. The ARM processor has a destination, source 1, source 2 instruction format, so you can do ADD R0,R1,R2 to do R0 = R1 + R2. Unlike the other processors mentioned so far, it has some other nice features. Most instructions can be executed conditionally, so to do an abs function you can just do RSBMI a, a, #0 which means if the minus flag is set, then subtract a from 0. The ARM processors are very efficient with low power usage, which means they are still used a lot in current machines (most mobile phones, the Nintendo DS and even the Windows 8 notebooks). Since all the instructions have to fit into 32 bits, you can't load any value you want into a register with a single instruction since there isn't enough room. The assembler will construct a pair of instructions if needed to load a constant value. A constant is encoded as an 8 bit value and a shift, so the values 0-255 are efficiently represented, but so is 0x8000. If you need to load the constant 0xffff, it may load (0xff << 8) and then or it with (0xff). The ARM processor also supports an additional mode, called Thumb mode, which uses 16 bit instructions. It takes more instructions to do the same work, but since they are smaller it can reduce the code size by around 30%.
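As a rough illustration of why some constants fit in one instruction and others don't, the Python sketch below checks whether a 32 bit value can be written as an 8 bit value rotated right by an even number of bits, which is the classic ARM data processing immediate form. The function names are invented for this example, and real assemblers have further tricks (such as using MVN or a literal pool) that are not modeled here.

```python
def rotate_left32(value, amount):
    """Rotate a 32 bit value left by amount bits."""
    amount %= 32
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def arm_immediate_encoding(value):
    """Return (imm8, rotate_right) if value fits a single immediate, else None.

    The constant is imm8 rotated right by an even amount, so rotating the
    constant left by the same amount must leave a value below 256.
    """
    for rot in range(0, 32, 2):
        imm8 = rotate_left32(value, rot)
        if imm8 < 256:
            return imm8, rot
    return None


for constant in (0xFF, 0x8000, 0xFF00, 0xFFFF):
    print(hex(constant), arm_immediate_encoding(constant))
```

Here 0xFF, 0x8000 and 0xFF00 all have an encoding, while 0xFFFF does not, which is why it has to be built from two parts such as 0xFF00 and 0xFF.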
MIPS and SPARC are both used in current courses, normally using an emulator such as MARS or SPIM for MIPS programming. Both of them are RISC processors which use the 3 operand format of instructions.
Some courses still teach programming under DOS using 8086 instructions, which means you have to use segment registers to access more than 64K. The 8086 has segment registers for code, data and stack. It also has 4 registers, AX, BX, CX and DX, which are general purpose although some instructions will only work with certain registers. You can also access the high or low byte of each of these registers (AH and AL for AX, and so on). It also has the pointer and index registers SP (stack pointer), BP (base pointer, normally used to access arguments passed on the stack), DI (destination index) and SI (source index). From the 386 processor onwards, it was expanded to 32 bits, with EAX to EDX, and the segments were expanded to allow for so called flat mode (4GB segments). The processor is little endian and is a CISC design. It has instructions for string handling (LODSB and MOVSB for byte oriented strings and LODSW and MOVSW for word oriented strings). You can put a REP prefix in front of the string instructions to repeat them. With the introduction of MMX, additional registers were added that allow you to operate as SIMD (single instruction multiple data), where you can do the same operation on multiple packed data values at the same time. There is also a 64 bit variant of the 8086 family which extends the registers to 64 bits (RAX to RDX, and so on) and adds R8 to R15. The 8086 supported a floating point coprocessor, which became an integrated part of the processor with the 80486DX.
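The way real mode segments reach past 64K can be shown with one line of arithmetic: the segment register is shifted left by 4 bits and added to the 16 bit offset. The Python sketch below uses a made up helper name, physical_address, and wraps at 1MB as the 20 bit address bus of the original 8086 does.

```python
def physical_address(segment, offset):
    """Real mode 8086 address: segment * 16 + offset, wrapped to 20 bits."""
    return ((segment << 4) + offset) & 0xFFFFF


# Different segment:offset pairs can name the same physical byte.
print(hex(physical_address(0x1234, 0x0010)))
print(hex(physical_address(0x1235, 0x0000)))
print(hex(physical_address(0xB800, 0x0000)))
```

Because segments overlap every 16 bytes, many segment:offset pairs refer to the same location, which is worth remembering when comparing far pointers.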
This is a rough outline of the topics involved in assembly language programming. I've tried to be reasonably comprehensive, so there will be sections that are not appropriate to some microprocessors. For example, the MMU (Memory Management Unit) is not typically found on 8 bit microprocessors, and the FPU (Floating Point Unit) may even be missing on current processors such as certain configurations of the ARM processor.
Inline assembly language and calling assembly language routines from C/C++
Inline assembly language allows you to mix C code and assembly language in the same function. It is easy to use but is not portable between different compilers. If you write a routine in assembly language, you can link it with code from a C compiler and call it as though it were a regular function (although you may need to specify some options in the C code to determine how parameters are passed, and the external name). You can also write an entire program in assembly language, in which case you don't need to worry about how it will interface with C.
The sample code given below is for a generic processor with registers with single letter names, that uses OPCODE SOURCE, DESTINATION syntax. You may find other formats in use, such as OPCODE DESTINATION, SOURCE [, SOURCE2], depending on the actual processor you are writing code for.
Whilst learning assembly language there are a few tools that can help you: disassemblers, which take an executable file and convert it into assembly language, and debuggers, which allow you to execute machine code step by step to see what happens. A lot of compilers also offer the option of generating assembly language from a high level language, which can be a useful way of generating a skeleton set of code.
Since you are working at a very low level, it is common to represent values as hexadecimal or binary values. Binary values are useful when working with masks, since you can see the bit patterns involved. Hexadecimal values make it easier to view the contents of memory since they are all the same length (8 hexadecimal digits for each 32 bit word for example).
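A quick way to get used to the two notations is to print the same value both ways. This small Python sketch (the show helper is just for illustration) formats a value as fixed width hexadecimal and binary, and applies a mask written in binary so the selected bits are visible.

```python
def show(value, bits=32):
    """Return (hex_text, binary_text) for value, padded to the given width."""
    hex_text = f"0x{value:0{bits // 4}X}"   # 4 bits per hexadecimal digit
    binary_text = f"{value:0{bits}b}"       # 1 bit per binary digit
    return hex_text, binary_text


value = 0b10110110
mask = 0b00001111                # keep only the low 4 bits
print(show(value, 8))
print(show(value & mask, 8))
print(show(255))
```

Written in binary, the mask makes it obvious which bits survive the AND; written in hexadecimal, a 32 bit word always takes 8 digits.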
Almost all processors are register based (there are some that are stack based but they are not in the mainstream). Most processors have a limited number of registers (between 3 and 32) and they have specific roles. The typical roles are address registers (including the stack pointer and program counter), segment registers (80x86 family in particular, where the segment registers define how logical addresses are mapped to physical memory and also determine the access rights), and a status (or condition code) register which keeps track of flags such as the zero flag, negative flag, parity flag, overflow flag, carry flag, the interrupt level, supervisor mode, and possibly others. There may also be index registers (although these tend to have been replaced with just regular registers now), and general arithmetic registers. An operand can normally be a value from a register, an immediate value, or a value in memory accessed from an absolute location, an indirect address (similar to using a pointer in C/C++), or an indirect address with an offset (either fixed, or provided by a register). A register is fast to access, and the register size is generally what determines the size of the processor (so an 8 bit processor has 8 bit registers, a 32 bit processor has 32 bit registers, and so on). There may also be floating point registers, which have some bits for the exponent, some for the mantissa and a sign bit. There can also be vector registers, which hold multiple values inside a single register (for example the MMX registers). Some processors allow you to treat a pair of registers as a single value, or to access part of a register as though it were a full register.
Since arithmetic operations involve the registers, you can perform simple arithmetic operations such as addition and subtraction (some processors also have multiplication and division). You use the carry flag if you need to handle numbers that won't fit inside a register. So for adding AB + CD you would do CLC; ADC B,D; ADC A,C, where ADC adds the source and the carry flag to the destination, so the carry out of the low parts (B and D) is added into the high parts (A and C). Some processors also have a BCD (Binary Coded Decimal) mode, which puts a single decimal digit in each 4 bits, so when you add the numbers together a carry may pass between each 4 bit value. There can also be floating point arithmetic instructions, which treat the values as floating point numbers. You can get underflow (where the number is too small to represent correctly) as well as infinity values (for example when dividing a non-zero number by 0), and NaN (not a number) for operations such as 0 divided by 0.
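The carry chain described above can be simulated in a few lines. This Python sketch models 8 bit registers and an add with carry operation (the names adc8 and add16 are assumptions for the example) to add two 16 bit numbers one byte at a time.

```python
def adc8(x, y, carry):
    """Add two 8 bit values plus the carry flag; return (result, carry_out)."""
    total = x + y + carry
    return total & 0xFF, total >> 8


def add16(ab, cd):
    """Add two 16 bit values using only 8 bit adds, like CLC; ADC; ADC."""
    a, b = ab >> 8, ab & 0xFF        # high and low bytes of the first number
    c, d = cd >> 8, cd & 0xFF        # high and low bytes of the second number
    low, carry = adc8(b, d, 0)       # CLC, then add the low bytes
    high, carry = adc8(a, c, carry)  # add the high bytes plus the carry
    return (high << 8) | low, carry


print(add16(0x12FF, 0x0001))
print(add16(0xFFFF, 0x0001))
```

The same pattern extends to any width: keep adding the next pair of bytes with the carry left over from the previous pair.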
Assembly language provides instructions for and, or, xor (exclusive or), not, shift and rotate. There may also be bit operations to set, clear or test an individual bit.
Most processors don't have conditional instructions (the ARM family does, and so you can write code slightly differently, although you still need to know about inverting the condition). You do a check, and then branch on the results of the condition code register, so if you want to write if (a < 0) a = b; you actually code it as if (a >= 0) goto skip; a = b; skip:
            CMP #0,A       ;Just loading A on some processors will set the condition flags, so this instruction may not be needed
            BPL .SKIP      ;Skip the assignment if A is not negative
            MOVE B,A       ;A = B
.SKIP:
Normally with loops, you tend to write them in the reverse order you would do in a high level language. Here we have an example of a for loop, for (int a = 0; a < 10; a++), although we have reversed it to for (int a = 10; a != 0; a--). The reason for reversing the loop is that DEC A sets the zero flag when A reaches 0, so we avoid the CMP #10,A that would be needed if we counted up with INC A instead.

            MOVE #10,A     ;We want to do this 10 times
.LOOP:
            ADD B,C          ;Add B to C
            DEC A              ;A = A - 1
            BNE .LOOP      ;If A is not 0 then goto loop
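A while loop is even simpler. We jump to the test at the bottom of the loop first, so the body is skipped if the condition is already false. For while (a != 17) we just do
            JMP .WHILETEST
.WHILELOOP:
            ....               ;Body of the loop
.WHILETEST:
            CMP #17,A
            BNE .WHILELOOP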
Since assembly language is more verbose than higher level languages, it is common to use macros to perform common operations. You may have a macro to display a string for example, which might look like this.
            MACRO DisplayString
            MOVE /1,B
            MOVE #PRINT_STRING,A
            SYSCALL A
            ENDMACRO
You would use it as DisplayString "Assembly Language Is Easy". It may need to be more complex depending on the assembler. Perhaps you need to put the string into the data segment, in which case you could have a macro
            MACRO DisplayString
            DATA
@StrData:   DCB /1,0
            CODE
            MOVE #@StrData,B
            MOVE #PRINT_STRING,A
            SYSCALL A
            ENDMACRO
So as you can see a macro can hide the differences between assemblers, and allow for code that can be more easily ported and is easier to read as well.
You can specify a value as an immediate value (typically with a # sign, so MOVE #10,A), from an absolute address (MOVE NUM_ELEMENTS,A), from a register (MOVE B,A), or from an indirect register (MOVE (B),A, equivalent to A = *B). There is also indirect register with post increment (MOVE (B)+,A, equivalent to A = *B++), indirect register with pre decrement (MOVE -(B),A, equivalent to A = *--B), indirect with offset (MOVE (B,#4),A), which is used for accessing structures, and indirect with register offset (MOVE (B,C),A), which is used for accessing arrays (A = B[C]). Some processors allow for scaled operands of the form MOVE (B,C*4),A (where the scale may be 2, 4 or 8).
There are 2 main ways of passing arguments to subroutines: you can either use the stack (which is commonly the way arguments are passed from C/C++), or you can use the registers. The advantage of using registers is that it is faster, but the code can be more complex since you need to preserve the registers at the calling site, and you are normally limited to a certain number of arguments before you need to use the stack anyway. When you use the stack to pass arguments, it automatically allows you to have recursive calls and also to reserve space for local variables. When you call a subroutine the return address (the address of the next instruction) is pushed on the stack, and when you return, the program counter is set to the value on the top of the stack and the stack pointer is adjusted to remove it. If you know that a call is the last thing in the routine you are currently in, then you can jump to the subroutine with a normal jump command rather than a jump to subroutine command, and when it returns it will return directly to the caller of the current routine. This is called tail call optimization and is a useful optimization. To access the arguments passed on the stack you would have code something like the following.
.SETCOLOR:     ;Set the color to R,G,B
            MOVE SP,S
            MOVE [S,#-3],R
            MOVE [S,#-2],G
            MOVE [S,#-1],B
            .... Rest of the code     
Most hardware can be accessed using memory mapped addresses, although on some processors it may be done by accessing a port using a special instruction IN, or OUT for example. Since you are operating at the level device drivers operate at, you need to make sure you don't exceed the capabilities of the device, so you may need to insert wait statements after accessing the hardware before checking the result or writing to the hardware again.
Interrupts are handled in a similar way to subroutines, but you need to make sure that you don't change the state of any of the registers, otherwise it may lead to unpredictable behavior in the rest of the program. Normally when entering an interrupt, further interrupts are disabled until the processing is complete, so you should ensure that the interrupt routines execute as fast as possible. You should also avoid calling system functions during an interrupt, such as writing to disk, as many operating system routines are not designed to be reentrant.
The MMU handles memory access: it converts logical addresses to physical addresses and also controls access. Virtual memory is also handled using the MMU. If memory has been allocated but is not in one of the loaded pages, a page fault is raised, which gives the operating system the chance to load the page from disk. The MMU can restrict access to memory depending on the current privilege level of the code (so the kernel can access all the memory, but user programs can only access memory to which they have the rights), and sections of memory can be marked as read only, or as not able to contain executable code.
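The translation step can be sketched as a table lookup. The Python model below is deliberately simplified: the 4096 byte page size, the dictionary based page table and the names translate and PageFault are all assumptions for illustration, and a real MMU does this in hardware with multi level tables.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class PageFault(Exception):
    """Raised when the page is not present; the OS would load it from disk."""


def translate(page_table, logical_address, write=False):
    """Convert a logical address to a physical one using a simple page table."""
    page, offset = divmod(logical_address, PAGE_SIZE)
    entry = page_table.get(page)
    if entry is None or not entry["present"]:
        raise PageFault(f"page {page} is not loaded")
    if write and not entry["writable"]:
        raise PermissionError(f"page {page} is read only")
    return entry["frame"] * PAGE_SIZE + offset


table = {
    0: {"frame": 5, "present": True, "writable": False},
    1: {"frame": 0, "present": False, "writable": True},
}
print(hex(translate(table, 0x123)))
```

Reading address 0x123 succeeds and maps into frame 5, writing to it raises PermissionError, and touching page 1 raises PageFault, which is the point at which an operating system would bring the page in and retry.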
There is a game coming out called 0x10c by the creator of Minecraft. You can write machine code programs that run on the in game processor to do tasks in the game. It may be a fun way of learning to write machine code programs (it is an emulated 16 bit processor). The game can be found at


The following C++ is given as a structural specification of the MIPS program that you will write. You must follow the key data structure (use an array of floating point numbers) and program structure (use all of the procedures defined in the program) to implement your MIPS program. The difference between this program and the program in question is the use of floating point numbers in the array and of floating point operations.

#include <iostream>
using namespace std;

static float nArray[50];   //declare an array of 50

static void setArray()     //Initialize the array
{
            for (int i = 0; i < 50; i++)
                        nArray[i] = i * 0.3;
}

static float addEven(int n)          //Add even numbers in the array
{
            cout << "n =" << n << endl;
            if (n == 0)
                        return nArray[n];
            else
                        return nArray[n] + addEven(n - 2);
}

static float addAll(int n)  //Add all numbers in the array
{
            cout << "n =" << n << endl;
            if (n == 1)
                        return nArray[n];
            else
                        return nArray[n] + addAll(n - 1);
}

static float addEvenAll(int n)      //Call different procedures depending on the value of n
{
            if (n % 2 == 0)   //n has been cast to an integer, so determine if it is an even number
                        return addEven(n);
            else
                        return addAll(n);
}

int main()        //main function is the entry point of the program
{
            float n, sum;
            setArray();         //Initialize the array
            cout << "If an odd number is entered, odd numbers will be added, otherwise, even "
                        "numbers will be added." << endl;
            cout << "Please enter a number between 1 and 49." << endl;
            cin >> n;
            while ((n < 1) || (n > 49))  //Check validity of the input
            {
                        cout << "Please enter a number between 1 and 49." << endl;
                        cin >> n;
            }
            sum = addEvenAll(n);
            cout << "sum = " << fixed << sum << endl;
            return 0;
}




You have already created the separate parts of the game. Now it is time to put them together. Your final program should have a graphic start menu, a high score log kept in a separate file, an options menu, and an asteroids-type game.


Take the menu from lab 2 and turn it into a graphical menu (you can still use key presses to select options rather than arrow keys and enter or mouse clicks). Use lab 4 (which uses lab 3) to keep track of the score. Use lab 5 to play the game. After the player has lost all of their life points, direct them to enter their name. Update the high score list (if necessary), save it to the high score file and display the updated high score list. After a key is pressed on the keyboard, return them to the main menu.


1. Write a program that will generate an array of ten random 32-bit integers, and that will display on the monitor the numbers followed by either the words “has the fourth bit set” or “does not have the fourth bit set”. Use the test instruction to test whether the bit is set or not. Note: the fourth bit is counted from the least significant end (the 8’s position).

2. Write a program that will input a number from the keyboard, and then display the number in binary form, as well as the number of ones in the number. Hint: Shift the value left (or right) and count the number of times the carry bit is set.

3. Write a program that will input two numbers from the keyboard and execute each of the signed and unsigned multiply and divide instructions. For each instruction, the program should display output that clearly shows the result of the instructions. Test and submit with data that shows the use of the edx register and the results with combinations of negative and positive values. Note that it is useful to use small negative numbers to create large unsigned values.
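Before writing the assembly for these questions it can help to check the logic in a high level language. The Python sketch below is only a model of the ideas (the function names are made up, and it is not the assembly solution): a mask of 8 tests the fourth bit as the test instruction would, shifting right one bit at a time counts the ones, and masking to 32 bits shows why a small negative number becomes a large unsigned value.

```python
def fourth_bit_set(value):
    """True if the fourth bit from the least significant end (the 8's place) is set."""
    return (value & 0x08) != 0


def count_ones(value, bits=32):
    """Count the one bits by shifting right and counting each bit shifted out."""
    ones = 0
    for _ in range(bits):
        carry = value & 1    # the bit a shift right would move into the carry flag
        value >>= 1
        ones += carry
    return ones


def to_unsigned32(value):
    """Reinterpret a signed 32 bit value as unsigned."""
    return value & 0xFFFFFFFF


print(fourth_bit_set(8), fourth_bit_set(7))
print(count_ones(0b1011))
print(to_unsigned32(-1))
```

The last line shows that -1 and 4294967295 share the same 32 bit pattern, which is what makes the signed and unsigned multiply and divide results differ.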




Since graphics plays a very important role in modern computer applications, it is important to know more about its hardware and software operations. Despite the simplicity of the concepts required for this assignment, upon completion of this lab, you will be able to:
1) Understand what a video buffer is and how to access it.
2) Understand how to control the contents of the video buffer.
3) Use the computer’s onboard clock.
4) Detect keyboard and mouse clicks and decode the pressed key.
5) Write user defined interrupt service routines (ISR).


1) Software Interrupts
Software interrupts are generated by programs, not by hardware. Since the first PCs with DOS and BIOS functions, all modern operating systems, including MS Vista, Linux, etc., still use software interrupts to access libraries, invoke system functions, and connect with the operating system kernel. Software interrupts are caused by a program issuing the software interrupt instruction to make the processor act as though it had received a hardware interrupt. This method is convenient for accessing OS services independently of any program location in memory. Although software interrupts may not be as critical as hardware interrupts, they are called frequently and must be efficiently implemented. A software interrupt can be seen as an indirect call to a procedure. In this case, the data structure where the address of such a procedure is stored is called the interrupt vector table, or just the interrupt vector. The interrupt vector is stored in the lowest 1024 bytes of system memory with 4 bytes (CS and IP) reserved per interrupt, for a total of 256 distinct interrupt vector entries. Upon examining this table, one will notice that there are some interrupts which are not used. Those vacancies are reserved for the users. In other words, INT 60H to INT 67H are reserved for user defined interrupt services. Because modifying an interrupt vector directly is not compliant with the specifications of any OS, MS-DOS also provides a safe way to consult and change the interrupt vector entries using interrupt 21H function numbers 25H and 35H. Here is what they look like:
a) Set interrupt vector:
i) Calling registers:
(1) AH: 25H
(2) AL: Interrupt number
(3) DS: Code segment of the interrupt service routine
(4) DX: Offset of the interrupt service routine
ii) Return registers: None
b) Get interrupt vector:
i) Calling registers:
(1) AH: 35H
(2) AL: Interrupt number

ii) Return registers:
(1) ES: Code segment of the interrupt service routine
(2) BX: Offset of the interrupt service routine
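The layout of the interrupt vector table described above can be modeled with a byte array. In this Python sketch (the names vector_address, set_vector and get_vector are assumptions that loosely mirror functions 25H and 35H, not DOS code) each entry is 4 bytes, with the offset (IP) stored first and the segment (CS) second, both little endian.

```python
def vector_address(interrupt_number):
    """Address of an interrupt's entry: 4 bytes per interrupt, starting at 0."""
    return interrupt_number * 4


def set_vector(memory, interrupt_number, segment, offset):
    """Store segment:offset in the table (a model of INT 21H function 25H)."""
    base = vector_address(interrupt_number)
    memory[base:base + 2] = offset.to_bytes(2, "little")       # IP first
    memory[base + 2:base + 4] = segment.to_bytes(2, "little")  # then CS


def get_vector(memory, interrupt_number):
    """Return (segment, offset) from the table (a model of function 35H)."""
    base = vector_address(interrupt_number)
    offset = int.from_bytes(memory[base:base + 2], "little")
    segment = int.from_bytes(memory[base + 2:base + 4], "little")
    return segment, offset


memory = bytearray(1024)                 # 256 entries of 4 bytes each
set_vector(memory, 0x63, 0x1234, 0x5678)
print(hex(vector_address(0x63)), get_vector(memory, 0x63))
```

On a real system you would go through the DOS functions rather than writing to the table yourself, for the reasons given in the text.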
Real Time Clock
Many applications require some sort of idle loop in order to “kill time” while the user is preparing to hit a key, move the mouse, etc. Most of the time, the OS itself will need to use such an idle loop for its internal operation (e.g. mouse movements). For this assignment we will use a BIOS system call for obtaining the system time in the form of a counter that is incremented every 55ms. This system function is:
a) INT 1Ah, with calling register AH = 00h and return registers:
i) AL: Midnight flag, 1 if 24 hours passed since reset
ii) CX: high order word of tick count
iii) DX: low order word of tick count
b) INT 1Ah, with calling register AH = 02h and return registers:
i) CH: Hour (BCD)
ii) CL: Minute (BCD)
iii) DH: Second (BCD)
iv) CF: 0, Clock is working; 1, No clock.
c) There is a DOS function call – service 2Ch – which can also yield the system time. However, we don’t use it in TSRs because of DOS reentrance considerations.
d) To create delay, you can also create nested loops. You can calculate the delay based on the number of clock cycles needed to execute a command and the number of cycles per second.
e) There is also a function in int 15h that can be used to wait.
Keyboard Detection
From previous labs, we are familiar with keyboard character/string input. However, for some situations, we may need to detect if there is any input/keyboard hit, rather than wait until an input has actually been made. For that, we will use INT 16h.
Mouse Detection
For mouse functions we use INT 33h.
Video Buffer
The video buffer is accessed via INT 10h. Video RAM is an element common to all video cards, whether the card is programmed in text mode or graphics mode. While video RAM is organized differently for different video cards in graphics mode, the structure is virtually identical for all video cards in text mode. Video cards only need memory segments A and B, starting at segment addresses A000H and B000H, each of which is 64K in size. Monochrome video cards (i.e., MDA and Hercules) use video RAM in the range B000:0000 to B000:7FFF. Color cards (i.e., CGA, EGA and VGA) use video RAM beginning at physical address B8000H or A0000H. In DOS and Win98, whenever the card is in text mode (i.e. inside the command prompt for Win98), the screen is divided into 80 columns. For Win XP and up, one is able to change the number of columns and the number of rows. The top-left most character on the screen starts at RAM address B800:0000. Each character uses 2 bytes, 16 bits. Viewed as a 16 bit word, the first four bits (the most significant) indicate the background color, the second four indicate the foreground color, and the final byte (the least significant) indicates the ASCII character, so in memory the character byte comes first, followed by the attribute byte. The four color bits are (most to least significant) brightness, red, green, and blue.
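The text mode layout can be tried out on an ordinary byte array before touching real video memory. The Python sketch below assumes an 80 column by 25 row screen and uses invented helper names; it computes the offset of a character cell from the start of the buffer at B800:0000 and builds the attribute byte from the foreground and background colors.

```python
COLUMNS = 80   # assumed text mode width
ROWS = 25      # assumed text mode height


def cell_offset(row, column):
    """Byte offset of a character cell from the start of the video buffer."""
    return (row * COLUMNS + column) * 2     # 2 bytes per character


def attribute(foreground, background):
    """Attribute byte: background in the high 4 bits, foreground in the low 4."""
    return ((background & 0x0F) << 4) | (foreground & 0x0F)


def put_char(video, row, column, char, foreground, background):
    """Write a character and its attribute into a buffer standing in for video RAM."""
    offset = cell_offset(row, column)
    video[offset] = ord(char)                             # character byte first
    video[offset + 1] = attribute(foreground, background)  # then the attribute


video = bytearray(COLUMNS * ROWS * 2)
put_char(video, 2, 3, "A", 0x0E, 0x04)     # bright yellow on red
print(cell_offset(2, 3), hex(video[326]), hex(video[327]))
```

In the assembly version the same offset would be added to a segment register loaded with B800H, and the two bytes can be written together as one 16 bit word.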


This experiment consists of two parts. First, write a set of 80x86 assembly language routines and a main program that can be installed as a TSR (Terminate and Stay Resident) program and that defines a user-defined ISR so it can be called by other programs. This user-defined interrupt service should contain the following functions:
1) INT 63H
a) Function 01H ==> clear screen
b) You may add other functions
2) INT 64H
a) Function 03H ==> draw a rectangle
b) Call registers:
i) AH: 03H
ii) AL: color code
iii) CX: X coordinate of the starting point
iv) DX: Y coordinate of the starting point
v) SI: X coordinate of the ending point
vi) DI: Y coordinate of the ending point
3) INT 65H ==> restore the original interrupt vector.
4) The interrupt service routine you write should become part of the Disk Operating System (DOS). In other words, it has to reside in the memory and become a Terminate and Stay Resident (TSR) program. INT 27H is recommended for this purpose:
a) Calling registers:
i) DX Offset of last byte plus 1 of the program to remain resident
ii) CS Segment of Program Segment Prefix
b) Search “INT 27” or “Terminate and Stay Resident” on the internet for more info.
When a user-defined interrupt service (or any ISR) is called, DS (the data segment register) is unchanged at the entry point of the ISR. Therefore, it will point to the original data segment of the calling function. As a result, DS must be manually initialized before any data inside the ISR can be accessed. Also, remember that an ISR is supposed to save (“PUSH”) all registers that it will modify during the course of its execution. These registers must be restored at exit.
For the second part of the lab, extend your original ISR to perform the following services:
1) Clear screen.
2) Display the game field and the score on the top right corner of the screen.
3) Start the game by shooting multiple projectiles from a random location on the edge of the screen.
4) The space ship, in the middle of the screen, is aimed using the mouse location, and a mouse click fires bullets in that direction to destroy the projectiles.
5) Each projectile that is destroyed is worth a certain number of points.
6) If the player is hit by a projectile, the space ship loses one of 3 life points.
7) The game stops when the space ship has been hit 3 times.

8) After the game ends it should go back to the game menu. In the next lab we will add high score functionality here.
9) Before exiting, the game should:
a) Clear the screen,
b) Restore the original interrupt vector table and return to DOS.
10) All BIOS function calls (macros) must be placed in a macro file, BIOS.MAC, while the DOS function calls (macros) must be placed in DOS.MAC.

The following program structure is suggested:
code SEGMENT
ASSUME cs:code, ds:code, ss:code
ORG 100h
begin: jmp initialize
; needed memory data here
; begin resident program here
int63 PROC FAR
int63 ENDP
int64 PROC FAR
int64 ENDP
int65 PROC FAR
int65 ENDP
; end resident program here
initialize:
ASSUME cs:code, ds:code, ss:code
lea dx, initialize ; DX = offset of last resident byte plus 1
int 27h ; terminate and stay resident
code ENDS
END begin
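
As a rough illustration of what function 03H of INT 64H has to compute, the sketch below models the rectangle service in C rather than in 80x86 assembly. It assumes a 320 x 200 framebuffer with one byte per pixel (as in VGA mode 13h) and an outlined rather than filled rectangle; the lab text specifies neither. The names `draw_rect`, `put_pixel`, `SCREEN_W` and `SCREEN_H` are hypothetical.

```c
#include <stdint.h>

enum { SCREEN_W = 320, SCREEN_H = 200 };  /* assumed screen geometry */

/* Plot one pixel, ignoring coordinates that fall off the screen. */
static void put_pixel(uint8_t *fb, int x, int y, uint8_t color)
{
    if (x >= 0 && x < SCREEN_W && y >= 0 && y < SCREEN_H)
        fb[y * SCREEN_W + x] = color;
}

/* Outline the rectangle whose opposite corners are (x0,y0) and (x1,y1).
 * In the ISR these would arrive as CX,DX (start), SI,DI (end), AL (color). */
void draw_rect(uint8_t *fb, int x0, int y0, int x1, int y1, uint8_t color)
{
    int t;
    if (x0 > x1) { t = x0; x0 = x1; x1 = t; }   /* normalise corner order */
    if (y0 > y1) { t = y0; y0 = y1; y1 = t; }
    for (int x = x0; x <= x1; x++) {            /* top and bottom edges */
        put_pixel(fb, x, y0, color);
        put_pixel(fb, x, y1, color);
    }
    for (int y = y0; y <= y1; y++) {            /* left and right edges */
        put_pixel(fb, x0, y, color);
        put_pixel(fb, x1, y, color);
    }
}
```

Working the loop bounds out in a high-level form first makes it easier to decide which registers the assembly version needs for the counters.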


MIPS assignment

* Comments in your code are required
* Main Program Operation:
# Your program should first prompt the user for an integer to seed the random number generator. “Enter a seed integer for random number generator:”

# Prompt the user to enter an integer to denote the number of values to create, n. The
limit is 100 numbers. “How many numbers to create:”

# Randomly generate and store the n integer values in the range of [-100,500]. Set
argument $a0, the ID of the pseudorandom number generator to 8.

# Write/Call a function to print the array values to the screen, printArray

# Write/Call a function for insertion sort, isort, to sort the values.

# Call printArray again to print the sorted values.

# Prompt the user to enter a filename. “Enter input filename:”

- When reading in a string, the newline character is appended to the end if the string does not fill the buffer. MARS does not allow the newline character to be in the filename in order to open a file. You must remove the newline character from the filename string prior to opening the file (replace it with NULL).


# Use syscall 13 and 14 to open the file and read from the input file. Print error and reprompt for filename if the file is not found. “Cannot open file.” Make sure to open and close each file properly.

# Write/Call a function, abcCount, to count the frequency of each alphabetic character
in the input file text.

# Print the total number of characters counted.

# Write/Call a function, printTable, to print a table which displays the frequency of each
# Write/Call a function, ROT47Cipher, to encrypt the input text and print it to the screen.
# Prompt the user to enter ‘E’ to exit or ‘R’ to repeat the program.

Technical Specification:

The program should contain the following five functions:

# isort: This function implements the iterative insertion sort algorithm.
* Input parameters: the base address of the array to be sorted, the number of elements in the array
* Return parameter: Total sum of values sorted.
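
The isort interface above can be sketched in C before translating it to MIPS. This is only a sketch under the stated specification: the array is sorted in place in ascending order (the handout does not state the direction, so ascending is an assumption) and the sum is accumulated in the same pass.

```c
/* Iterative insertion sort of a[0..n-1], ascending.
 * Returns the sum of all values, as the isort specification requires. */
int isort(int *a, int n)
{
    int sum = 0;
    for (int i = 0; i < n; i++) {
        int key = a[i];
        int j = i - 1;
        sum += key;
        /* Shift larger elements one slot right to open a gap for key. */
        while (j >= 0 && a[j] > key) {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = key;
    }
    return sum;
}
```

In the MIPS version the base address and element count would arrive in the argument registers, and each `a[j]` access becomes an address computation with a 4-byte stride.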

# printArray: This function should print an array of integers to the screen.

* Input parameter: the base address of the array to be printed, number of elements to print.
* Return parameter: None

# abcCount: This function examines each character in the text string and increments the count of the character in the array. Ignore any punctuation and spaces. Assume capital and lowercase letters are equivalent.

* Input parameter: the base address of the array for storing the letter count, the base address of the text string
* Return parameter: The total number of characters processed
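
A minimal C sketch of abcCount follows. It assumes a 26-entry count array indexed from 'a' and takes "characters processed" to mean the letters actually counted; the handout leaves both points open, so treat them as assumptions.

```c
/* Count letters in a NUL-terminated string, case-insensitively.
 * counts[0] is 'a' ... counts[25] is 'z'; the caller zeroes the array.
 * Returns the number of letters counted (an assumed reading of the spec). */
int abcCount(int counts[26], const char *text)
{
    int total = 0;
    for (const char *p = text; *p != '\0'; p++) {
        char c = *p;
        if (c >= 'A' && c <= 'Z')
            c = (char)(c - 'A' + 'a');      /* fold upper case to lower */
        if (c >= 'a' && c <= 'z') {
            counts[c - 'a']++;
            total++;
        }
        /* digits, punctuation and spaces are ignored */
    }
    return total;
}
```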
# printTable: This function prints a two column table which displays the letter followed by its frequency count.

* Input parameter: the base address of the array for the frequency count, the number of elements to print
* Return parameter: None

# ROT47Cipher: ROT47 is a substitution cipher that uses the ASCII table to replace the plaintext with letters, numbers, and symbols. Specifically, the 7-bit printable characters from 0x21 ‘!’ to 0x7E ‘~’ (94 in total) are rotated by 47 positions. No special consideration is taken for upper case and lower case, and the space character is left intact. For example, ‘a’ is mapped to ‘2’ and ‘9’ is mapped to ‘h’. Note: Spaces are left untouched, but all other visible symbols are encrypted.
* Input parameter: the base address of the text string
* Return parameter: None
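
The rotation described above reduces to one modular-arithmetic step per character. The C sketch below encrypts a string in place; the name `rot47_cipher` is hypothetical, and the real assignment function would also print the result.

```c
/* Apply ROT47 in place to a NUL-terminated string.
 * Characters in '!'..'~' (94 of them) are rotated by 47 positions;
 * everything else, including the space, is left untouched. */
void rot47_cipher(char *s)
{
    for (; *s != '\0'; s++) {
        if (*s >= '!' && *s <= '~')
            *s = (char)('!' + (*s - '!' + 47) % 94);
    }
}
```

Because 47 is half of 94, applying the function twice restores the original text, which is a convenient self-check when debugging the MIPS version.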
# All functions should be called from the main program.
# You may write additional functions if you choose. However, DO NOT deviate from the above function specifications, redefine the parameters and/or functionality of the functions. Implement the functions and program as described above

Data segment:
# You should allocate space in your memory to generate at most 100 integer values, space for 100,000 characters to be read from the text file, and space for storing the frequency of each character. Remember if you allocate your space after strings you must realign your memory such that the integer values are word aligned. Use the .align


# Sample Output:

Enter a seed for the random number generator: 483

Enter the number of values to generate: 5




The sum of all numbers was 753

Enter text filename: input.txt

The file contained 416 characters.

The character frequencies are: (Note that these are not accurate)
The ROT47Cipher of the text is: (HW6-SampleInput)

|C !9:=62D u@88 =:G65[ :? `gfa[ 2E }@] f[ $2G:==6 #@H[ qFC=:?8E@?
v2C56?D[ E96 9@FD6 :? H9:49 $96C:52? 5:65 :? `g`c] w6 H2D @?6 @7 E96
>@DE ?@E:4623=6 >6>36CD @7 E96 #67@C> r=F3[ E9@F89 96 D66>65 2=H2JD
2G@:5 2EEC24E:?8 2EE6?E:@?j 2? 6?:8>2E:42= A6CD@?286[ 23@FE H9@>
H2D <?@H?[ 6I46AE E92E 96 H2D 2 A@=:D965 >2? @7 E96 H@C=5]


And here is a sample answer for one of the questions (not one that was submitted by a student):

CSC 230 – Fall 2012 – Assignment 2 part 2

Due Tuesday October 23, by 15:00

Total Marks = 50 + bonus

1. The problem to be solved and implemented with an ARM assembly language program
You are asked to do some image processing on an image composed of characters shaped in an m × n grid. For example, a 9 × 6 image may look like the one shown in Figure 1, which, more or less, looks like a representation of the letter “F” (you will not be using colour, however).


Figure 1: A 9 x 6 image containing the letter “F” described by “&” characters

This is a subset of the C program in assignment 1. It should let you focus on learning ARM programming, since you are already completely familiar with the problem, and indeed you can use the previous C code as your guideline pseudo-code when designing your ARM program. Also feel free to use the C code provided as a general answer to all and posted on the course web pages.

The program is expected to read from an input file an image described by integers, convert each integer appropriately to a character, do some image processing tasks on the character image, and print the results. The functional requirements were explained in the previous assignment. Here you will be rotating the image right by 90 degrees and then scaling it up, doubling its size, using the second (vector-based) algorithm which you used in your previous C program. Doing both tasks may prove to be too challenging at this point in your learning path, so you are given a choice of which task to implement and submit. Should you be able to submit both, you will receive an extra 30 marks bonus (that is, instead of 50 marks you could earn 80). Read the expectations carefully, both for the design and for the technical details. An initial template file is also given to you, with some of the code already provided. Make sure you analyze it carefully and customize it appropriately, including the comments.

2. The overall requirements – pseudo-code and flowchart
The easiest way to summarize the flow of your application is to outline the overall functionality with pseudo-code, as in Figure 2. The detailed specifications are given below. Make sure to read carefully all the comments also included in the template file given to you as the initial code structure.

3. The code for the I/O processing
Most of the code for I/O processing is given to you in a template file posted on the web pages and called “A2Scale-frame.s”. There an input file, called A2In.txt, is opened, the data is read in, stored and printed. You are absolutely responsible for understanding every detail of the code given to you and you are free to modify it if you think it appropriate. Before doing so, it might be a good idea to come and consult. Moreover you should learn from the code style what the expectations are for your own code (e.g. documentation and structure).


Print a welcome message for the program with identification
Open the input file
    if problems, print message and exit program
Call “RdInt” to read row size of the image
While it is not end of file {
    Call “RdInt” to read column size of matrix
    Call “RdImage” to read integer elements of the image IM1 and store them as characters
    Call “PrImage” to print the initial image IM1
    Do one of: (or both for bonus)
        1) Rotate the image IM1 right by 90 degrees, store in IM2 and print it
        2) Scale the image IM1 to size x 2 with vector algorithm, store in IM3 and print it
    Call “RdInt” to read row size of the image
}
Exit program: close the input file, print a closing message, exit.
Figure 2: Pseudo Code

3.1. The input file (similar to assignment 1)
The data in the input text file is in the form of K groups of integers, where each group represents an image. In each group the first 2 integers represent the row size m and the column size n, while the following m × n integers represent the entries of the image. The input file *must* be called: “A2In.txt”.
3.2. Assumptions on the input data
• The maximum sizes of an image are m = 10 for the rows and n = 10 for the columns.
• Given the sizes of an image, it is guaranteed that exactly m × n integers of the correct value (0 or 1) follow, thus there is no need to do any validation on the data.
3.3. Sample input and output
A sample input file to be used for testing purposes will be posted on the web pages. The template for the output is similar to assignment 1. If you lost marks in assignment 1 in this regard, make sure to improve. Note the details of the expectations for the format, including name, student number, comments, etc.

4. The specifications: storage and I/O
4.1. Allocating the storage. The declaration for the required storage of an image IM1, the rotated image IM2 and the scaled-up image IM3 must take into consideration that the maximum original dimensions are 10 × 10. Remember that you must allocate the correct number of bytes! You must allocate separate storage for IM2 and IM3. The sizes of the matrices must be called: Rsize1, Csize1 for IM1, Rsize2, Csize2 for IM2, and Rsize3, Csize3 for IM3.
4.2. All I/O processing. The instructions for I/O processing are local to the simulator and are explained in the ARMSim# manual. You are expected to use them correctly and thus understand their interfaces. Additional sample source files, for example reading a list of integers from a file and printing them out both to the Output View screen (which in ARMSim# corresponds to “stdout”) and to a file, are available to you. Help with these processing steps is also given in the lab sessions. The initial file given to you for this assignment includes most of the I/O processing and you can use some of the code there as your guide.
4.3. Reading the dimensions and the integer elements from a file. The image is given as a set of integers (0 or 1) and you are guaranteed that the correct number is present. End of file is checked only when attempting to read the next row size for the next image. This is done for you in the initial file.


4.4. Transforming the image to characters. Transform each integer read from the file into a 1-byte character, as: a “1” becomes a “&” and a “0” becomes a “+”. Store the characters into a two-dimensional array of characters. All this transformation work is done from within the procedure RdImage introduced above. Thus there is no need to allocate a 2D array of integers. You should allocate the initial 2D array of characters of size 10 x 10, which is the maximum you would ever deal with.
4.5. Printing the image. The image should be printed row by row as characters, thus there is a newline character at the end of each row. This is already done for you in the PrImage procedure, but you should feel free to add custom touches.
4.6. Printing messages. The local ARMSim# instructions for printing a string both to the Output View and to a file are explained in the ARMSim# manual. You are expected to format appropriate messages. Examples are shown in the code already provided to you plus in the lab examples and in the documentation.
4.7. Names for the input file, and for the program file
In the final submission:
• The input file *must* be called: “A2In.txt”
• The program file containing the source code *must* be called: “A2csc230.s”

5. The specifications: implementing the tasks
5.1. Rotating right by 90 degrees
Figure out exactly how you are going to copy the elements of IM1 into IM2 correctly. The algorithm can be the same as what you had in the C program, namely copy each column of IM1, from top to bottom row, to each row of IM2, from right to left column. Pay attention to how the elements are actually stored in memory. While you may view them as a 2D image and thus a 2D array of characters, the storage itself is a linear 1D array of characters, row by row. You need to figure out precisely the distance (offset) between rows and columns and elements, so that you can increment to the next positions appropriately and differently for IM1 and IM2. Test it out with an example by hand before getting lost in ARM programming.
5.2. Scaling up to twice the size: the vector-based algorithm
Doubling the size of the image is to be done using the second algorithm from your previous assignment. This second algorithm, whose code was actually mostly given to you, operates on the view of the 2D array as a 1D vector (as it is actually stored in IM1) and quadruples its length by producing a second 1D vector (stored in IM3) where each element is replicated four times. It does this in a manner such that, when the new longer vector is interpreted again as a 2D array, the elements fall in the correct position for doubling a 2D image (see footnote 1). Be curious and follow through manually a few iterations of the code for a small array - you will have fun! The computation of the indices is really the challenging part as it appears to require almost all the registers available (but make sure to avoid using R13-R15). Please note also that you need to implement an integer division and there is no division instruction in ARM. Thus the initial given code contains a function called “DIV” which computes the quotient of two integers (in a very basic algorithm) – you have seen this code as an example in the lectures. You need to call this function appropriately.
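
Both tasks come down to index arithmetic on a row-by-row 1D array, so it can help to check the offsets in C first. The sketch below assumes the image is stored tightly packed (row stride equal to the actual column count). The scaling routine follows the fixed-point ratio formula quoted in the comments of the sample solution, and the function names `rotate_right` and `scale_up2` are hypothetical.

```c
/* Rotate a rows x cols image right by 90 degrees.
 * Column j of im1 (top to bottom) becomes row j of im2 (right to left).
 * im2 is cols x rows. */
void rotate_right(const char *im1, char *im2, int rows, int cols)
{
    for (int j = 0; j < cols; j++)
        for (int i = 0; i < rows; i++)
            im2[j * rows + (rows - 1 - i)] = im1[i * cols + j];
}

/* Double a rows x cols image using 16.16 fixed-point ratios.
 * im3 is (2*rows) x (2*cols). */
void scale_up2(const char *im1, char *im3, int rows, int cols)
{
    int rows2 = 2 * rows, cols2 = 2 * cols;
    int x_ratio = ((cols << 16) / cols2) + 1;   /* +1 guards against rounding */
    int y_ratio = ((rows << 16) / rows2) + 1;
    for (int i = 0; i < rows2; i++) {
        for (int j = 0; j < cols2; j++) {
            int x2 = (j * x_ratio) >> 16;       /* source column */
            int y2 = (i * y_ratio) >> 16;       /* source row */
            im3[i * cols2 + j] = im1[y2 * cols + x2];
        }
    }
}
```

For a 2 × 3 image stored as "abcdef", `rotate_right` produces the 3 × 2 image "daebfc", which is easy to verify by hand before writing the ARM loops.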

6. What about the proper use of functions and procedures?
The program described above with all the processing done in “main” is not exactly an elegant solution. In fact in the initial template given to you there are some subroutines already, for example the procedure RdImage, the procedure PrImage and the function DIV. Each of them uses parameter passing through registers, saves and restores the state of computation (the list of registers) on the stack so that there are no side effects and follows the interface expectation of a regular C compiler (that is, which registers are used for input and output). All this is explained in the lectures, but at the point when this assignment is being designed and implemented by you it might be too early for the information to have been fully absorbed and ready to be used correctly. Thus you are not required to use any new subroutines in your submission. You are required to understand the existing given ones and use them correctly.

1. The full code and explanation is also found at:

However, you may have a perfectly working program plus time and willingness to expand your learning experience. Do go ahead and use functions and procedures and you will receive extra marks, 2 marks for
each extra function. My sample solution uses the following subroutines with these interfaces:
• void RotR(R0:addr IM1; R1:addr IM2; R2:Rsize1; R3:Csize1)
• void CopyColRow (R0:addr IM1; R1:addr IM2, R2:Nrows, R3:Ncols, R4:Coli, R5:Rowj)
• void ScaleUp2a(R0:addr IM2; R1:addr IM3; R2:Rsize2; R3:Csize2)
• void RdImage(R0:addr IM1; R1:Rsize1; R2:Csize1) → inline code given
• void PrImage(R7:addr image; R2:row size; R3:col size) → given
• int:R0 DIV(R1:dividend; R2:divisor) → given
Just make sure that you implement your functions correctly, where all needed information is passed only through the parameters, there is no usage of global data, no loading of data from memory, all locally used registers are saved and restored so that there are no side effects (that is, proper modularity and encapsulation). To help you in this regard, the template code given to you includes some framework for possible

7. A systematic approach
1. A2csc230V1.s: Start with the code given to you for reading and printing images. Analyze it in depth, assemble it and make sure you understand what is going on. Be patient and step through it.
2. A2csc230V2a.s: Design and insert the code for rotating right 90 degrees [30 marks for correct execution]. If you copy a column to a row, as in the previous implementation, start by writing the code to copy a single column to a single row. Once that portion is working, then introduce the control loop to copy every column in turn to every row. Print the new image.
3. A2csc230V2b.s: Design and insert the code for scaling up and print the results [30 marks for correct execution]. Be careful to compute the desired indices into the arrays step by step, without overwriting registers whose values you need later and, where possible, minimizing the storing of information in extra variables in memory.
4. A2csc230.s: Clean up all documentation and test completely. Submit. [20 marks for quality of code, i.e. structure and design].
5. A2csc230.s: Change some of the code in the main program to be encapsulated into functions or procedures. You will get 2 bonus marks for each of Rotate, ScaleUp2, RdImage [6 bonus]. Note that you already have the code for RdImage thus it is a matter of moving that code segment and encapsulating it properly into a procedure.
6. Make sure that for every step above you make a separate version of your program file without overwriting your previous results. Test each step. Do not proceed to the next task until you have fully finished the previous ones and you have come to ask questions if there are any doubts. It is then easy to backtrack to a tested working version when debugging.

8. Documentation
1. Do not forget about good documentation throughout, that is, meaningful and logical comments, not simply stating the sometimes obvious semantics of each line of code. This is particularly important at the assembly language level where an instruction may look incredibly simple and yet it badly needs to be commented in order to communicate the meaning of the action. For example, the instruction MOV R1,#0 obviously can be read directly as an assignment of 0 to register R1 and in that it needs no further explanation. Yet its real meaning may be the initialization of i=0 as a counter for a loop, something which would be obvious in a high level language where the same counter would be given a self documenting name and would be inside an obvious header for a “for loop”, for example, but here everything is only a register number. It is extremely easy to get lost in your own code!
Your source code file must start with the following items, shown by example:
/* V00123456 Albert Einstein
October 2012
CSC 230, Assignment 2, Part 2 */
/* This program implements...a short description here... */
Every procedure or function must start with the following items, including a description of what it does, a description of how it does it (especially if non intuitive), and more documentation on inputs/outputs and side effects when appropriate:
=== int:R0 DIV (int Dividend:R1; int Divisor:R2)
Given the numbers M >= 0 and N >0,
return the integer quotient M/N while
using repeated subtractions
The remainder is not returned
Input parameters:
@R1 = dividend
@R2 = divisor
Outputs: the integer quotient in R0
Algorithm description......and pseudo code......
That is, every subroutine must have a clearly readable start point, followed by a clear explanation of its purpose, followed by documentation of its parameters and finally its initial header, before any executable instructions appear.
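
The DIV header used as the documentation example describes division by repeated subtraction. A minimal C sketch of that algorithm, under the same preconditions (dividend >= 0, divisor > 0), is shown here; the name `div_quotient` is hypothetical and this is not the template's actual code.

```c
/* Integer quotient M / N by repeated subtraction.
 * Preconditions: dividend >= 0 and divisor > 0.
 * The remainder is discarded, as in the DIV header above. */
int div_quotient(int dividend, int divisor)
{
    int quotient = 0;
    while (dividend >= divisor) {   /* one subtraction per unit of quotient */
        dividend -= divisor;
        quotient++;
    }
    return quotient;
}
```

The loop runs once per unit of quotient, which is acceptable here because the values involved are small image indices.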

9. Evaluation
Correct execution of your program will gain you 30/50 of the total marks. The other 20/50 marks of the evaluation are based on the quality of your code, which will be analyzed for design, structure, documentation, clarity, organization and modularity.

10. What to hand in
Your working implementation in a file named “A2csc230.s” by electronic submission through the Connex site.

11. The Initial Template
To help you out on the correct path, an initial file called “A2Scale-frame.s” is posted for you. Make sure you understand the code in this file really well, and then proceed to make new versions of it, in separate files, with your new code. Check the web pages often, attend the labs and the lectures, as more examples and information to help you along will become available.
It is strongly suggested that, at a minimum, you read all this assignment and understand the template code before Quiz #3 in Lab 6. That lab will help you for the implementation.

12. Sample output
Use the correct execution and output of your own C program from assignment 1 or the solution posted to check your results.
And the answer was:

@ File: A2Scale-frame.s
@ Read only images from A1In.txt
@ Refer also to the original solution in C
@ This solution is based on the template originally given

@ Print a welcome message for the program start with identification
@ Open the input file
@                     if problems, print message and exit program
@ Call RdInt to read row size of the image
@ While it is not end of file {
@         Call RdInt to read column size of matrix
@         Call RdImage to read integer elements of the image IM1 and
@                       store them as characters
@         Call PrImage to print the initial image IM1
@         Do one of: (or both for bonus)
@                     1) Rotate the image IM1 right by 90 degrees, store
@                  in IM2 and print it
@                     2) Scale the image IM1 to size x 2 with vector
@                  algorithm, store in IM3 and print it
@         Call RdInt to read row size of the image
@ }
@ Exit program: close the input file, print a closing message, exit.

@ Code for File I/O taken from IO_Example2a.s

            .equ      MAXROW, 10 @ Maximum of 10 rows
            .equ      MAXCOL, 10 @ Maximum of 10 columns
@ Use nice labels for the hex SWI codes:
            .equ      SWI_Open,       0x66     @ open a file
            .equ      SWI_Close,       0x68     @ close a file
            .equ      FileInputMode,  0          @ Input mode for file
            .equ      SWI_RdInt,        0x6c     @ Read an Integer from a file
            .equ      SWI_PrChr,       0x00     @ Write a byte as an ASCII char
                                            @ to Output View
            .equ      SWI_PrStr,        0x69     @ Write a null-ending string
            .equ      SWI_PrInt,         0x6b     @ Write an Integer
            .equ      Stdout,             1          @ Set output mode to be Output View
            .equ      SWI_Exit,          0x11     @ Stop execution
            .equ      AMP_CHAR,  0x26
            .equ      PLUS_CHAR, 0x2B
            .global _start
_start:

@@@ Refer to IO Primer in ARMSim# and to Labs: IO_Example2a.s and to
@@@ QUIZ 3
@@@ Refer also to the C code from Assignment 1 (answer from instructor
@@@ posted)
@ ===============================================================
@         Open the input file for reading
@                     if problems, print message to screen and exit
            ldr        r0,=InputFileName         @ set Name for input
                                            @ file
            mov      r1,#0                             @ mode is input
            swi       SWI_Open                    @ open file for input
            bcs       InFileError                     @ Check Carry-Bit (C): if= 1
                                            @ then ERROR
            ldr        r1,=InputFileHandle        @ if OK, load input file handle
            str        r0,[r1]                            @ save the file handle  

@ ===============================================================
@         Print an initial message for the program opening
@   Print a header message with your name and student number
            mov      r0, #Stdout
            ldr        r1, =HelloMsg
            swi       SWI_PrStr                     @ R0:target, R1:msg

@ =================================================
@         Read row size from the input file
            ldr        r0, =InputFileHandle       @ load the input file handle
            ldr        r0, [r0]               @ R0 has file handle    
            swi       SWI_RdInt                     @ R0=Rsize1
            bcs       Closure                         @ if c bit is set, EOF reached
IOLoop:                                                            @ else read column size and image
            mov      r2, r0                 @ R2= #rows
            ldr        r0, =InputFileHandle       @ load the input file handle
            ldr        r0, [r0]
            swi       SWI_RdInt                     @ R0=Csize1
            mov      r3, r0                 @ R3= #cols
            ldr        r1, =Rsize1
            str        r2, [r1]                       @ store rows
            ldr        r1, =Csize1
            str        r3, [r1]                       @ store cols
@ =================================================
@         Read exactly (Rsize1)x(Csize1) integers as elements for an image,
@         convert each correctly to characters, and store as 2D char array
@ ===== the code segment below would be better converted to
@                     a procedure void RdImage(R0:file handle;R1=&IM1;
@                     R2=Rsize1;R3=Csize1)

            ldr        r0, =InputFileHandle
            ldr        r0, [r0]               @ R0= & filehandle
            ldr        r1, =IM1                        @ R1 = & original image
            @ R2 has #rows
            @ R3 has #cols
            bl      RdImage
@ =================================================
@         Print original image as characters with headings
            mov      r0, #Stdout
            ldr        r1, =OrIm
            swi       SWI_PrStr                     @ R0:target, R1:msg    
@ Set up the input parameters to call PrImage to print the image
            ldr        r1,=IM1                     @ R1 = address of IM1
            ldr        r2,=Rsize1
            ldr        r2,[r2]                        @ R2 = Rsize1
            ldr        r3,=Csize1
            ldr        r3,[r3]                        @ R3 = Csize1
            BL        PrImage                                @ PrImage(&IM1:R1,
                                            @ RowSize:R2,ColSize:R3)
@ =================================================
@ Rotate right 90 degrees: given r rows and c cols
@ copy each col of IM1 from top to bottom
@         to each row of IM2 from right to left, that is:
@ copy col 0 of IM1 (from 0 to r-1) to row 0 of IM2 (from c-1 to 0)
@ copy col 1 of IM1 from 0 to r-1 to row 1 of IM2 from c-1 to 0
@  . . .
@ copy col c-1 of IM1 (from 0 to r-1) to row c-1 of IM2 (from c-1 to 0)

        ldr     r0,=IM1                 @ R0:source image
        ldr     r1,=IM2                 @ R1:rotated image
        mov     r2,#MAXROW*MAXCOL       @ R2 = number of bytes to pre-fill
RtDone:                                 @ pre-fill IM2 with '?' placeholders
        ldrb    r3,[r0],#0
        mov     r3,#'?'
        strb    r3,[r1],#0
        add     r0,r0,#1
        add     r1,r1,#1
        subs    r2,r2,#1
        bne     RtDone
        ldr     r0,=IM1                 @ R0:source image
        ldr     r1,=IM2                 @ R1:rotated image
        ldr     r2,=Rsize1
        ldr     r2,[r2]                 @ R2 = Rsize1 = Csize of IM2
        ldr     r3,=Csize1
        ldr     r3,[r3]                 @ R3 = Csize1 = Rsize of IM2
        BL      RotR          
@ =================================================
@ Print Rotated IM2 as characters with headings
            mov      r0, #Stdout
            ldr        r1, =RtIm
            swi       SWI_PrStr                     @ R0:target, R1:msg    
@ set up to call routine to print the image
            ldr        r1,=IM2                     @ R1 = address of IM2
            ldr        r2,=Csize1
            ldr        r2,[r2]                        @ R2 = Csize1 = Rsize of IM2
            ldr        r3,=Rsize1
            ldr        r3,[r3]                        @ R3 = Rsize1 = Csize of IM2
            BL        PrImage                                @ PrImage(&IM2:R1,Rsize2:R2,Csize2:R3)
@ =================================================
@ Scale up to twice the size with vector algorithm
@ Compute the scaling ratio as integers
            @ Note: added +1 to account for rounding problem
            @ x_ratio = (int)((Csize1<<16)/Csize2)+1;
            @ y_ratio = (int)((Rsize1<<16)/Rsize2)+1;
            @ Compose the new 1D vector of correct size - twice here
            @         for (i=0;i<(Rsize2);i++) {
            @                     for (j=0;j<Csize2;j++) {
            @                                 x2 = ((j*x_ratio)>>16);
            @                                 y2 = ((i*y_ratio)>>16);
            @                                 temp2[(i*Csize2)+j] = temp1[(y2*Csize1)+x2];
            @                     }
            @         }

        ldr     r0,=IM1                 @ r0 = source image
        ldr     r1,=IM3                 @ r1 = doubled image
        ldr     r2,=Rsize1
        ldr     r2,[r2]                 @ r2 = # rows
        ldr     r3,=Csize1
        ldr     r3,[r3]                 @ r3 = # cols
        bl      ScaleUp2a               @ ScaleUp2a(IM1,IM3,rows,cols)
@        add     r5,r2,r2
@        add     r6,r3,r3
@        mul     r3,r2,r3
@        movs    r3,r3,lsl #2
@        mov     r2,#'*'
@        strb    r2,[r1],#0
@        add     r1,r1,#1
@        subs    r3,r3,#1
@        bne     fill
@ =================================================
@ Print Scaled up IM3 as characters with headings
            mov      r0, #Stdout
            ldr        r1, =Sc2Im
            swi       SWI_PrStr                     @ R0:target, R1:msg    
@ set up to call routine to print the image
            ldr        r1,=IM3                     @ R1 = address of IM3
            mov      r2,r5                          @ R2 = Rsize2 of IM3
            mov      r3,r6                          @ R3 = Csize2 of IM3
            BL        PrImage                                @ PrImage(&IM3:R1,Rsize3:R2,Csize3:R3)
@ === read next image row size and check for end of file
            ldr        r0,=InputFileHandle        @ load the input file handle
            ldr        r0, [r0]
            swi       SWI_RdInt                     @ Rsize1=R0   
            bcs       Closure                         @ if c bit is set,
                                            @ EOF  reached
            bal        IOLoop                         @ else keep looping
@ === Exit segment =================================
@ Print a final message to screen 
            mov      r0, #Stdout
            ldr        r1, =Bye
            swi       SWI_PrStr                     @ R0:target, R1:msg

@ Close the input file
            ldr        R0, =InputFileHandle    @ get address of file handle
            ldr        R0, [R0]                @ get value at address
            swi       SWI_Close
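Exit:
            swi       SWI_Exit
@ File-open error handler. Its entry label is not shown in this listing;
@ the branch taken when the open fails must target the next line.
            mov      R0,#Stdout                   @ to screen
            ldr        R1, =FileOpenInpErrMsg
            swi       SWI_PrStr                     @ display error message
            bal        Exit                    @ give up, exit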


@ =================================================
@ PrImage(&ImageChar:R1,RowSize:R2,ColSize:R3)
@ Print a 2D array of char row by row
@ Input parameters:
            @         R1 = address of 2D array
            @         R2 = # rows
            @       R3 = # columns
@ Output parameters: None
@ Assumptions: elements are stored consecutively
PrImage:
            STMFD    sp!,{r0-r5,lr}        @ save registers
            mov      r4,r1                 @ r4 = &image to be printed
ROWLOOP:
            mov      r5,r3                 @ set column counter
COLLOOP:
            ldrb     r0,[r4],#1            @ get char to be printed
            swi      SWI_PrChr             @ print it
            subs     r5,r5,#1              @ next element, same row?
            bne      COLLOOP
            mov      r0,#Stdout            @ mode is Output view
            ldr      r1,=EOL               @ end of line
            swi      SWI_PrStr
            subs     r2,r2,#1              @ next row?
            bne      ROWLOOP
            ldmfd    sp!,{r0-r5,pc}        @ restore registers and return

@ =================================================
@ RdImage(R0:file handle;R1=&IM1;R2=Rsize1;R3=Csize1)
@ Read a 2D image array
@ Input parameters:
        @       R0 = file handle
            @         R1 = address of 2D array
            @         R2 = # rows
            @       R3 = # columns
@ Output parameters: None
@ Assumptions: elements are stored consecutively
RdImage:
            STMFD    sp!,{r0-r5,lr}        @ save registers
            mul      r4,r2,r3              @ compute number of elements to read
            mov      r5,r0                 @ save input file handle for re-use
ReadLoop:
            mov      r0,r5                 @ get input file handle
            swi      SWI_RdInt             @ R0 = element as integer
            cmp      r0,#0                 @ start converting to char
            beq      ReadZero
            mov      r0,#AMP_CHAR          @ int != 0: store AMP_CHAR
            bal      SaveChar
ReadZero:
            mov      r0,#PLUS_CHAR         @ int = 0: store PLUS_CHAR
SaveChar:
            strb     r0,[r1],#1            @ store char, advance to next element
            subs     r4,r4,#1              @ decrement loop counter
            bgt      ReadLoop
            ldmfd    sp!,{r0-r5,pc}        @ restore registers and return


@ =================================================     
@ === int:R0 DIV (int Dividend:R1; int Divisor:R2 )
@ Given the numbers M >= 0 and N >0, 
@         return the integer quotient M/N  while
@         the remainder is not returned
@         using repeated subtractions
@ Input parameters:
            @         R1 = dividend
            @         R2 = divisor
@ Outputs:       the integer quotient in R0
@ Algo: Repeatedly subtract the divisor N from M, as in M = M - N.
@ Count the number of iterations Q until M < 0.
@ This is one too many iterations and the quotient is Q - 1.
@ The remainder is M + N, the last positive value of M.

@ Pseudo-code 1
@         integers: m, n, quotient, remainder
@         quotient = 0
@         DO
@                     quotient = quotient + 1
@                     m = m - n
@         WHILE m >= 0
@         quotient = quotient - 1
@         remainder = m + n
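@
@ Worked trace of the pseudo-code above, using illustrative values
@ (m = 17, n = 5) that are not taken from the assignment input:
@         pass 1: quotient = 1, m = 12
@         pass 2: quotient = 2, m = 7
@         pass 3: quotient = 3, m = 2
@         pass 4: quotient = 4, m = -3    (m < 0, so the loop ends)
@         quotient = 4 - 1 = 3
@         remainder = -3 + 5 = 2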

DIV:
            stmfd    sp!,{r1-r4,lr}
            mov      r3,#0                 @ r3 = quotient = 0
subloop:
            add      r3,r3,#1              @ r3 = quotient + 1
            subs     r1,r1,r2              @ compute m - n
            bpl      subloop               @ repeat while m >= 0
            sub      r3,r3,#1              @ r3 = correct quotient
            add      r1,r1,r2              @ r1 = remainder (discarded on return)
            mov      r0,r3                 @ return quotient
            ldmfd    sp!,{r1-r4,pc}

@ =================================================
@ === void RotR (IM1, IM2, Rsize1, Csize1)
@ Given the 2D char array of Image1 and its dimensions,
@         construct the rotated by 90 degree image in Image2
@ Input parameters:
            @         R0 = addr IM1
            @         R1 = addr IM2
            @         R2 = Rsize1
            @         R3 = Csize1
@ Outputs:       none
@ Loop for each column i of IM1 and copy it
@ from top to bottom to row i of IM2 right
@ to left. Control loop for each column here,
@ then call CopyColRow in turn to do the copying
@ of one column at a time
RotR:
            STMFD   sp!,{r4,lr}
            mov     r4,r3                   @ i = Csize1; loop runs in reverse
LoopRotR:
            subs    r4,r4,#1                @ loop backwards
            bmi     DoneRotR                @ i < 0
            mov     r5,r4                   @ call uses i,i (r5 is not restored)
            BL      CopyColRow              @ CopyColRow(Image1,Image2,
                                            @ Nrows,Ncols,i,i)
            BAL     LoopRotR
DoneRotR:
            LDMFD   sp!,{r4,pc}

@ =================================================     
@ === void CopyColRow (M1, M2, Nrows, Ncols
@                                 Coli, Rowj)
@ Copy Coli of M1 from 0 to Nrows-1
@         to Rowj of M2 from Ncols-1 to 0
@         that is, copy a specified column of M1 from top
@         element to bottom into the specified single row of M2
@         from right to left
@ Input parameters:
            @         R0 = addr M1
            @         R1 = addr M2
            @         R2 = Nrows
            @         R3 = Ncols
            @         R4 = Coli
            @         R5 = Rowj
@ Outputs:       none
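@ Worked example of the mapping Mat2[Rowj][Nrows-ri-1] = Mat1[ri][Coli],
@ using an illustrative 2x3 image that is not from the input file and
@ assuming each row of M2 is Nrows chars wide, as the offsets below imply:
@         M1 (Nrows = 2, Ncols = 3):    a b c
@                                       d e f
@         CopyColRow with Coli = 0, Rowj = 0:
@                     ri = 0: M2[0][1] = M1[0][0] = a
@                     ri = 1: M2[0][0] = M1[1][0] = d
@         Row 0 of M2 becomes "d a"; columns 1 and 2 give rows "e b"
@         and "f c", which is M1 rotated 90 degrees clockwise.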
CopyColRow:
            STMFD   sp!,{r0-r5,lr}
            add     r0,r0,r4                @ address of M1[0][Coli]
            mul     r5,r2,r5                @ r5 = offset of M2[Rowj][0]
            add     r1,r1,r5                @ address of M2[Rowj][0]
            add     r1,r1,r2                @ address of M2[Rowj][Nrows]
            mov     r5,r2                   @ loop counter = Nrows
            sub     r3,r2,r3                @ r3 = Nrows - Ncols
            sub     r2,r2,r3                @ r2 = Ncols, the row stride of M1
@         for (ri = 0; ri < Nrows; ri++) {
@                     Mat2[Rowj][Nrows-ri-1] = Mat1[ri][Coli];
@         }
LoopCopyColRow:
            sub     r1,r1,#1                @ step back to M2[Rowj][Nrows-ri-1]
            ldrb    r4,[r0],#0              @ r4 = Mat1[ri][Coli]
            strb    r4,[r1],#0              @ Mat2[Rowj][Nrows-ri-1] = r4
            add     r0,r0,r2                @ next row of M1
            subs    r5,r5,#1
            bne     LoopCopyColRow          @ end for loop
            LDMFD   sp!,{r0-r5,pc}

@ =================================================
@ === void ScaleUp2a (IM1, IM2, Rsize1, Csize1) == OPTIONAL
@ Given the 2D char array of IM1 and its dimensions,
@         construct IM2 of twice its size using a vector-based algorithm
@ Input parameters:
            @         R0 = addr IM1
            @         R1 = addr IM2
            @         R2 = Rsize1
            @         R3 = Csize1
@ Outputs:       R5, R6 = new rows,cols
@ Description and pseudo code: ......
        @       For each row:
        @          For each column:
        @               byte = [r0]++
        @               store 4 copies of byte
        @          skip to next output line
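        @
        @       Worked example with an illustrative 2x2 input
        @       (not from the input file):
        @               IM1:   &+        IM2:   &&++
        @                      +&               &&++
        @                                       ++&&
        @                                       ++&&
        @       The byte at input (r,c) lands at output (2r,2c), (2r,2c+1),
        @       (2r+1,2c) and (2r+1,2c+1); each output row is 2*Csize1 bytes.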

ScaleUp2a:
        add     r5,r2,r2                @ r5 = Rsize1 * 2
        add     r6,r3,r3                @ r6 = Csize1 * 2
        STMFD   sp!,{r1-r7,lr}          @ save modified r5,r6 so r5 can be
                                        @ reused as a counter and both are
                                        @ restored as the return values
        add     r4,r1,r6                @ r4 = second output line
RowLoopScaleUp2a:
        mov     r5,r3                   @ for each column
ColLoopScaleUp2a:
        ldrb    r7,[r0],#1              @ byte = [r0]++
        strb    r7,[r1],#1              @ copy byte 4 times
        strb    r7,[r1],#1              @ make 2nd copy
        strb    r7,[r4],#1              @ make copy on next line
        strb    r7,[r4],#1              @ make 2nd copy on next line
        subs    r5,r5,#1                @ end for col
        bne     ColLoopScaleUp2a
        add     r1,r1,r6                @ skip a line
        add     r4,r4,r6                @ skip a line
        subs    r2,r2,#1                @ end for row
        bne     RowLoopScaleUp2a
        LDMFD   sp!,{r1-r7,pc}          @ return with r5,r6 = new rows,cols
@ =================================================     
Rsize1: .word    0
Csize1: .word    0
InputFileHandle:            .skip     4
IM1:      .skip     MAXROW*MAXCOL
IM2:      .skip     MAXCOL*MAXROW           @Rotated version
IM3:      .skip     MAXROW*MAXCOL*4         @Scaled version
InputFileName:              .asciz    "A2In.txt"
FileOpenInpErrMsg:       .asciz    "Failed to open input file \n"
OrIm:    .asciz    "\n\n @@@ The original image contains: \n"
RtIm:    .asciz    "\nThe rotated image contains: \n"
Sc2Im:  .asciz    "\nThe scaled up image contains: \n"
EOL:     .asciz    "\n"
            .ascii    "\n Albert Einstein - Student Number V00xxxxxx \n"
            .ascii    "\n File = A2ImageScaleFinal.s - Fall 2012 \n"
            .ascii    "\n CSC 230, Assignment 2 \n\n"
            .asciz    "Starting: \n"
Bye:     .asciz    "\n All done - Bye! \n"