The Ghosts of Buffer Overflow

An enormous amount of due diligence.  That’s what it will take to overcome one of the most common computer security vulnerabilities that has been vexing cyberspace for years, according to John Clark of the University of York.  “To make buffer overflows a thing of the past will require an enormous amount of due diligence – systematic, thorough code review and testing – as new code is written,” he wrote. “But the sheer volume of code that exists, such as the potentially 15-year-old lines that include this flaw, never mind that being written anew, should give some indication of the scale of the task.” Clark calls buffer overflows “the ghosts that will always be among us”.

Exceeding Data Boundaries

Also called a buffer overrun, a buffer overflow is a flaw that occurs when a program writes more data into a buffer than the memory allocated for it can hold. It could very well be unintentional, but clever bad guys can also exploit the behavior to inject malicious code. James P. Anderson wrote about it in 1972 in the Computer Security Technology Planning Study for the U.S. Air Force: “The code performing this function does not check the source and destination addresses properly, permitting portions of the monitor to be overlaid by the user. This can be used to inject code into the monitor that will permit the user to seize control of the machine.”


Anderson was talking about the movement of information between system and user space. To get a better understanding of the process, we should take a look at how buffers are used in computer programming.  A buffer is an area of physical memory storage. Techopedia calls a buffer “a temporary holding area for data while it’s waiting to be transferred to another location”. The term can apply to a wide variety of memory handling functions. Buffers allow programs to simultaneously read input and send output to another data space. Buffering is an integral technique in the control of data flow.

The problem of buffer overflow occurs when there’s too much data input and, instead of being truncated or rejected, the excess is written into adjacent memory. A computer does just what you tell it to do. If you don’t control the data input, it will just keep on writing to memory. You have to set boundaries.

A segmentation fault causes programs to crash. “Segfaults are caused by a program trying to read or write an illegal memory location,” says an Indiana University knowledge base. It is also known as an access violation. By default a program will normally terminate when a segmentation fault occurs. If you’re lucky, that will happen when someone tries to exploit buffer overflow vulnerabilities.  But not always.

How to Hack

White hat hackers like to play. And they love to show the world how much they know about breaking into systems. If you really want to learn how to perform a buffer overflow hack, there are plenty of resources on the internet to help you. The Open Web Application Security Project (OWASP) gives a couple of examples of a buffer overflow attack. Here’s Example 1, written in the C language:


#include <stdio.h>

int main(int argc, char **argv)
{
    char buf[8]; // buffer for eight characters
    gets(buf); // read from stdio (sensitive function!)
    printf("%s\n", buf); // print out data stored in buf
    return 0; // 0 as return value
}

In their analysis, OWASP says the program “does no checks against overflowing the size assigned to this buffer”. The program calls for a buffer of eight characters. But in their compilation, OWASP was able to get the program to print twelve characters, overflowing the buffer by four characters. In Example 2, they were able to overwrite the extended instruction pointer (EIP) register.

“These kinds of errors are very easy to make,” says the OWASP writer. “For years they were a programmer’s nightmare. The problem lies in native C functions, which don’t care about doing appropriate buffer length checks.” At the risk of borrowing too heavily, we’ll list the C functions that OWASP gives as being vulnerable to buffer overflow exploitation. Where an arrow appears, it points to the bounded alternative:

  • gets() -> fgets() – read characters
  • strcpy() -> strncpy() – copy content of the buffer
  • strcat() -> strncat() – buffer concatenation
  • sprintf() -> snprintf() – fill buffer with data of different types
  • (f)scanf() – read from STDIN
  • getwd() – return working directory
  • realpath() – return absolute (full) path


This is where the idea of due diligence comes in. A successful programmer has to do a lot more than make a program work. The program also has to be secure. Thankfully there are safer equivalents for some of these older C functions, such as fgets() in place of gets(). But that’s just one way to deal with unintentional buffer overruns. Dealing with purposeful mischief is something different.

Who Knows Why?

“The goal of a buffer-overflow exploit is to disrupt a desired program flow,” wrote Isaac Gerg in “An Overview and Example of the Buffer-Overflow Exploit” in a Department of Defense Newsletter. “Specifically, buffer overflows often attempt to gain entire or partial control of a system or daemon.” Gerg provides a good explanation of buffer-overflow theory and the use of memory address space. Buffer overflow attackers will take over your computer if you let them.

It’s not within the scope of this article to understand the motivations behind such actions. We can only offer broad outlines and examples. The Morris Worm was launched in 1988 by Robert Tappan Morris, a Cornell graduate student who released it from a computer at MIT. “Nor was the intent of Morris clear,” writes Christopher Kelty of UCLA. “Some speculate that the release was either premature or accidental.” The Morris worm didn’t destroy computers; it only slowed them down. Kelty explores the implications of the exploit and Morris’ criminal prosecution.

The Code Red Worm has been listed as one of the “10 Worst Computer Viruses of All Time”. In 2001 it spread through a buffer overflow in Microsoft’s IIS web server, infecting Windows 2000 and Windows NT machines and using them to launch a distributed denial of service (DDoS) attack on the White House website. A contemporary analysis of the exploit provided a rundown:

Attack www.whitehouse.gov functionality

—————————————

Sooner or later every thread within the worm seems to shift its attacking focus to www.whitehouse.gov.

Create socket and connect to www.whitehouse.gov on port 80 and send 100k bytes of data (1 byte at a time).

CODEREF: seg000:000008AD WHITEHOUSE_SOCKET_SETUP

Initially the worm will create a socket and connect to 198.137.240.91 (www.whitehouse.gov/www1.whitehouse.gov) on port 80.

CODEREF: seg000:0000092F WHITEHOUSE_SOCKET_SEND

If this connection is made then the worm will create a loop that performs 18000h single byte SEND()’s to www.whitehouse.gov.

CODEREF: seg000:00000972 WHITEHOUSE_SLEEP_LOOP

After 18000h SEND()’s the worm will sleep for about four and a half hours. It will then repeat the attack against www.whitehouse.gov (go to step 1 of attack www.whitehouse.gov functionality).

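You can see how a lot of repeated “sends” could tie up White House computers. The figure 18000h is hexadecimal for 98,304, so each pass pushes roughly 100k bytes, one byte at a time, from every infected machine. Presumably somebody wanted to see how much power they had over government systems. Again, we’ll have to defer to psychology experts regarding why they did it.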

Due Diligence

Perhaps the most critical aspect of preventing buffer overflow is the choice of programming language.  It should have automatic bounds checks and automatic memory management. And the language should be strongly typed, which means that each type of data is clearly defined. As TechTarget states, a strongly typed computer language “imposes a rigorous set of rules on a programmer and thus guarantees a certain consistency of results”. Languages may be considered “safe” or “unsafe” in this respect.


OWASP provides a table that deals with these programming language characteristics.  They also go on to list general prevention techniques, including code auditing, developer training, compiler tools, and patches. And the safe functions already mentioned are also part of the mix.

A 1995 announcement by Imperial College about “Bounds Checking for C” offers some interesting insight. “We’re very excited about this: we can check every time a program uses a pointer or array and ensure that only valid references are allowed.” The authors chose to deal with the vulnerabilities of the C language rather than discarding it. Programmers today may prefer other languages to C. But that would not prevent buffer overflow exploitation of programs that were written long ago.

The current Wikipedia entry for buffer overflow protection gives a fair treatment of the subject. It says that developers can use canaries (like canaries in a coal mine) to detect whether data has been corrupted. These are known values placed in memory next to a buffer; if an overflow runs past the buffer it overwrites the canary first, and the changed value gives the overflow away. The article also provides a list of implementations that include buffer overflow prevention measures:

  • GNU Compiler Collection (GCC)
  • Microsoft Visual Studio
  • IBM Compiler
  • Clang/LLVM
  • Intel Compiler
  • Fail-Safe C
  • StackGhost (hardware-based)


A blog post by Veronica Robinson gives a good summary of the prevention we are looking for: “A buffer overflow attack can be prevented or mitigated with proper coding practices or boundary checking on input received from users.”  Firewalls won’t do it. Intrusion detection systems won’t do it. There’s just no substitute for good coding.

Conclusion

As John Clark said, the only way to beat the buffer overflow hacker is by “systematic, thorough code review and testing”. Just when you think a security vulnerability has been conquered, it shows its ugly face once more. Some might have been scared when the Linux “Ghost” vulnerability, a buffer overflow in the GNU C Library’s gethostbyname functions, appeared in 2015. But the risk didn’t appear to be as great as first believed. Nevertheless, if Clark is to be believed, we should never let down our guard against the ghosts of buffer overflow exploitation.

Total Uptime offers protection against buffer overflow and other vulnerabilities in our Web Application and API Protection (WAAP) service.
