reading file line by line in Java with BufferedReader

Reading files in Java is the cause of a lot of confusion. There are multiple ways of accomplishing the same task and it's often not clear which file reading method is best to use. Something that's quick and dirty for a small example file might not be the best method to use when you need to read a very large file. Something that worked in an earlier Java version might not be the preferred method anymore.

This article aims to be the definitive guide for reading files in Java 7, 8 and 9. I'm going to cover all the ways you can read files in Java. Too often, you'll read an article that tells you one way to read a file, only to discover later there are other ways to do that. I'm actually going to cover 15 different ways to read a file in Java. I'm going to cover reading files in multiple ways with the core Java libraries as well as two third party libraries.

But that's not all. What good is knowing how to do something in multiple ways if you don't know which way is best for your situation?

I also put each of these methods to a real performance test and documented the results. That way, you will have some hard data to know the performance metrics of each method.

Methodology

JDK Versions

Java code samples don't live in isolation, especially when it comes to Java I/O, as the API keeps evolving. All code for this article has been tested on:

  • Java SE 7 (jdk1.7.0_80)
  • Java SE 8 (jdk1.8.0_162)
  • Java SE 9 (jdk-9.0.4)

When there is an incompatibility, it will be stated in that section. Otherwise, the code works unaltered for different Java versions. The main incompatibility is the use of lambda expressions, which were introduced in Java 8.

Java File Reading Libraries

There are multiple ways of reading from files in Java. This article aims to be a comprehensive collection of all the different methods. I will cover:

  • java.io.FileReader.read()
  • java.io.BufferedReader.readLine()
  • java.io.FileInputStream.read()
  • java.io.BufferedInputStream.read()
  • java.nio.file.Files.readAllBytes()
  • java.nio.file.Files.readAllLines()
  • java.nio.file.Files.lines()
  • java.util.Scanner.nextLine()
  • org.apache.commons.io.FileUtils.readLines() – Apache Commons
  • com.google.common.io.Files.readLines() – Google Guava

Closing File Resources

Prior to JDK7, when opening a file in Java, all file resources would need to be manually closed using a try-catch-finally block. JDK7 introduced the try-with-resources statement, which simplifies the process of closing streams. You no longer need to write explicit code to close streams because the stream is closed for you automatically, whether an exception occurred or not. All examples used in this article use the try-with-resources statement for importing, loading, parsing and closing files.
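
To make the difference concrete, here is a minimal sketch of the same read written both ways. It is not one of the tested examples from this article: the class name CloseResourcesSketch and both method names are made up for illustration, and it reads a throwaway temp file rather than C:\temp.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, for illustration only
class CloseResourcesSketch {

    // Pre-JDK7 style: the reader has to be closed by hand in a finally block.
    static String firstLineManualClose(String fileName) throws IOException {
        BufferedReader reader = null;
        try {
            reader = new BufferedReader(new FileReader(fileName));
            return reader.readLine();
        } finally {
            if (reader != null) {
                reader.close();
            }
        }
    }

    // JDK7+ style: try-with-resources closes the reader automatically.
    static String firstLineAutoClose(String fileName) throws IOException {
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            return reader.readLine();
        }
    }

    public static void main(String[] args) throws IOException {
        // Use a temp file so the sketch runs on any machine
        Path temp = Files.createTempFile("close-sketch", ".txt");
        Files.write(temp, "first line\nsecond line\n".getBytes(StandardCharsets.US_ASCII));

        System.out.println(firstLineManualClose(temp.toString()));
        System.out.println(firstLineAutoClose(temp.toString()));

        Files.delete(temp);
    }
}
```

Both methods return the same line; the second one simply needs less code to guarantee the reader is closed.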

File Location

All examples will read test files from C:\temp.

Encoding

Character encoding is not explicitly saved with text files so Java makes assumptions about the encoding when reading files. Usually, the assumption is correct but sometimes you want to be explicit when instructing your programs to read from files. When the encoding isn't correct, you'll see funny characters appear when reading files.

All examples for reading text files use two encoding variations: the default system encoding, where no encoding is specified, and explicitly setting the encoding to UTF-8.
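
As a small illustration of where those funny characters come from, the sketch below decodes the same UTF-8 bytes twice: once as UTF-8 and once as ISO-8859-1. The class name, the decode helper and the sample word are all made up for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical class, for illustration only
class EncodingMismatchSketch {

    // Decode raw bytes into a String using the given charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // "cafe" with an accented e, written as a unicode escape so the
        // encoding of this source file doesn't matter
        byte[] utf8Bytes = "caf\u00e9".getBytes(StandardCharsets.UTF_8);

        System.out.println("System default charset: " + Charset.defaultCharset());

        String right = decode(utf8Bytes, StandardCharsets.UTF_8);
        String wrong = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        // The accented letter is two bytes in UTF-8, so the wrong charset
        // turns it into two unrelated characters
        System.out.println("Decoded as UTF-8:      " + right + " (" + right.length() + " chars)");
        System.out.println("Decoded as ISO-8859-1: " + wrong + " (" + wrong.length() + " chars)");
    }
}
```

The bytes never change; only the charset used to interpret them does. That is why being explicit about the encoding can matter when the file was written on a different system.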

Download Code

All code files are available from GitHub.

Code Quality and Code Encapsulation

There is a difference between writing code for your personal or work project and writing code to explain and teach concepts.

If I were writing this code for my own project, I would use proper object-oriented principles like encapsulation, abstraction, polymorphism, etc. But I wanted to make each example stand alone and easily understood, which meant that some of the code has been copied from one example to the next. I did this on purpose because I didn't want the reader to have to figure out all the encapsulation and object structures I so cleverly created. That would take away from the examples.

For the same reason, I chose NOT to write these examples with a unit testing framework like JUnit or TestNG because that's not the purpose of this article. That would add another library for the reader to understand that has nothing to do with reading files in Java. That's why all the examples are written inline inside the main method, without extra methods or classes.

My main purpose is to make the examples as easy to understand as possible and I believe that having extra unit testing and encapsulation code will not help with this. That doesn't mean that's how I would encourage you to write your own personal code. It's just the way I chose to write the examples in this article to make them easier to understand.

Exception Handling

All examples declare any checked exceptions in the throwing method declaration.

The purpose of this article is to show all the different ways to read from files in Java. It's not meant to show how to handle exceptions, which will be very specific to your situation.

So instead of creating unhelpful try-catch blocks that just print exception stack traces and clutter up the code, all examples will declare any checked exception in the calling method. This will make the code cleaner and easier to understand without sacrificing any functionality.

Future Updates

As Java file reading evolves, I will be updating this article with any required changes.

File Reading Methods

I organized the file reading methods into three groups:

  • Classic I/O classes that have been part of Java since before JDK 1.7. This includes the java.io and java.util packages.
  • New Java I/O classes that have been part of Java since JDK 1.7. This covers the java.nio.file.Files class.
  • Third party I/O classes from the Apache Commons and Google Guava projects.

Classic I/O – Reading Text

1a) FileReader – Default Encoding

FileReader reads in one character at a time, without any buffering. It's meant for reading text files. It uses the default character encoding on your system, so I have provided examples for both the default case, as well as specifying the encoding explicitly.

import java.io.FileReader;
import java.io.IOException;

public class ReadFile_FileReader_Read {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";

        // try-with-resources closes the reader automatically
        try (FileReader fileReader = new FileReader(fileName)) {
            int singleCharInt;
            char singleChar;
            // read() returns -1 at the end of the file
            while ((singleCharInt = fileReader.read()) != -1) {
                singleChar = (char) singleCharInt;

                // display one character at a time
                System.out.print(singleChar);
            }
        }
    }
}

1b) FileReader – Explicit Encoding (InputStreamReader)

In Java 7 through 9 it's actually not possible to set the encoding explicitly on a FileReader (constructors that accept a Charset were only added in Java 11), so you have to use the parent class, InputStreamReader, and wrap it around a FileInputStream:

import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class ReadFile_FileReader_Read_Encoding {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        FileInputStream fileInputStream = new FileInputStream(fileName);

        // specify UTF-8 encoding explicitly
        try (InputStreamReader inputStreamReader =
                new InputStreamReader(fileInputStream, "UTF-8")) {

            int singleCharInt;
            char singleChar;
            // read() returns -1 at the end of the file
            while ((singleCharInt = inputStreamReader.read()) != -1) {
                singleChar = (char) singleCharInt;
                System.out.print(singleChar); // display one character at a time
            }
        }
    }
}

2a) BufferedReader – Default Encoding

BufferedReader reads an entire line at a time, instead of one character at a time like FileReader. It's meant for reading text files.

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class ReadFile_BufferedReader_ReadLine {
    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        FileReader fileReader = new FileReader(fileName);

        // closing the BufferedReader also closes the wrapped FileReader
        try (BufferedReader bufferedReader = new BufferedReader(fileReader)) {
            String line;
            // readLine() returns null at the end of the file
            while ((line = bufferedReader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}

2b) BufferedReader – Explicit Encoding

In a similar way to how we set the encoding explicitly for FileReader, we need to create a FileInputStream, wrap it inside an InputStreamReader with an explicit encoding and pass that to BufferedReader:

import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

public class ReadFile_BufferedReader_ReadLine_Encoding {
    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";

        FileInputStream fileInputStream = new FileInputStream(fileName);

        // specify UTF-8 encoding explicitly
        InputStreamReader inputStreamReader = new InputStreamReader(fileInputStream, "UTF-8");

        try (BufferedReader bufferedReader = new BufferedReader(inputStreamReader)) {
            String line;
            // readLine() returns null at the end of the file
            while ((line = bufferedReader.readLine()) != null) {
                System.out.println(line);
            }
        }
    }
}

Classic I/O – Reading Bytes

1) FileInputStream

FileInputStream reads in one byte at a time, without any buffering. While it's meant for reading binary files such as images or audio files, it can still be used to read text files. It's similar to reading with FileReader in that you're reading one byte at a time as an integer and you need to cast that int to a char to see the ASCII character.

Unlike FileReader, FileInputStream applies no character encoding: it returns raw bytes. Casting each byte to a char, as the example does, therefore only reproduces the text correctly for single-byte characters such as ASCII.

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class ReadFile_FileInputStream_Read {
    public static void main(String[] pArgs) throws FileNotFoundException, IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        try (FileInputStream fileInputStream = new FileInputStream(file)) {
            int singleCharInt;
            char singleChar;

            // read() returns one byte as an int, or -1 at the end of the file
            while ((singleCharInt = fileInputStream.read()) != -1) {
                singleChar = (char) singleCharInt;
                System.out.print(singleChar);
            }
        }
    }
}

2) BufferedInputStream

BufferedInputStream reads a set of bytes all at once into an internal byte array buffer. The buffer size can be set explicitly or use the default, which is what we'll demonstrate in our example. The default buffer size appears to be 8KB but I have not explicitly verified this. All performance tests used the default buffer size so it will automatically re-size the buffer when it needs to.

import java.io.BufferedInputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;

public class ReadFile_BufferedInputStream_Read {
    public static void main(String[] pArgs) throws FileNotFoundException, IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);
        FileInputStream fileInputStream = new FileInputStream(file);

        // no buffer size given, so the default is used
        try (BufferedInputStream bufferedInputStream = new BufferedInputStream(fileInputStream)) {
            int singleCharInt;
            char singleChar;
            // read() is served from the internal buffer; -1 marks the end of the file
            while ((singleCharInt = bufferedInputStream.read()) != -1) {
                singleChar = (char) singleCharInt;
                System.out.print(singleChar);
            }
        }
    }
}
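
The example above uses the default buffer size. For reference, the size can also be passed as a second constructor argument. The sketch below was not part of the performance tests; the class name, the countBytes helper and the 64KB figure are arbitrary choices for illustration.

```java
import java.io.BufferedInputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, for illustration only
class BufferedInputStreamSizeSketch {

    // Count the bytes in a file, reading through a buffer of the given size.
    static long countBytes(String fileName, int bufferSize) throws IOException {
        long count = 0;
        try (BufferedInputStream in =
                new BufferedInputStream(new FileInputStream(fileName), bufferSize)) {
            // each read() is served from the in-memory buffer,
            // which is refilled from the file only when it runs empty
            while (in.read() != -1) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Use a temp file of 10,000 zero bytes so the sketch runs anywhere
        Path temp = Files.createTempFile("buffer-sketch", ".bin");
        Files.write(temp, new byte[10_000]);

        // 64KB is an arbitrary size picked for the example
        System.out.println(countBytes(temp.toString(), 64 * 1024));

        Files.delete(temp);
    }
}
```

Whether a larger buffer actually helps will depend on the file size and the machine, so it's worth measuring before changing the default.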

New I/O – Reading Text

1a) Files.readAllLines() – Default Encoding

The Files class is part of the new Java I/O classes introduced in jdk1.7. It only has static utility methods for working with files and directories.

The readAllLines(Path) overload that takes no charset was introduced in jdk1.8, so this example will not work in Java 7. Note that this overload decodes the file as UTF-8 rather than using the system default character encoding.

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.List;

public class ReadFile_Files_ReadAllLines {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // reads the whole file into memory as a list of lines
        List<String> fileLinesList = Files.readAllLines(file.toPath());

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

1b) Files.readAllLines() – Explicit Encoding

          

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.List;

public class ReadFile_Files_ReadAllLines_Encoding {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // use UTF-8 encoding
        List<String> fileLinesList = Files.readAllLines(file.toPath(), StandardCharsets.UTF_8);

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

2a) Files.lines() – Default Encoding

This code was tested to work in Java 8 and 9. It didn't run on Java 7 because of the lack of support for lambda expressions; Files.lines() and the Stream API it returns were also only added in Java 8.

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.util.stream.Stream;

public class ReadFile_Files_Lines {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // the stream holds the file open, so close it with try-with-resources
        try (Stream<String> linesStream = Files.lines(file.toPath())) {
            linesStream.forEach(line -> {
                System.out.println(line);
            });
        }
    }
}

2b) Files.lines() – Explicit Encoding

Just like in the previous example, this code was tested and works in Java 8 and 9 but not in Java 7.

import java.io.File;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.stream.Stream;

public class ReadFile_Files_Lines_Encoding {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // use UTF-8 encoding
        try (Stream<String> linesStream = Files.lines(file.toPath(), StandardCharsets.UTF_8)) {
            linesStream.forEach(line -> {
                System.out.println(line);
            });
        }
    }
}

3a) Scanner – Default Encoding

The Scanner class was introduced in Java 5 (jdk1.5) and can be used to read from files or from the console (user input).

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ReadFile_Scanner_NextLine {
    public static void main(String[] pArgs) throws FileNotFoundException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        try (Scanner scanner = new Scanner(file)) {
            String line;
            // hasNextLine() returns false once the end of the file is reached
            while (scanner.hasNextLine()) {
                line = scanner.nextLine();
                System.out.println(line);
            }
        }
    }
}

3b) Scanner – Explicit Encoding

          

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ReadFile_Scanner_NextLine_Encoding {
    public static void main(String[] pArgs) throws FileNotFoundException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // use UTF-8 encoding
        try (Scanner scanner = new Scanner(file, "UTF-8")) {
            String line;
            // hasNextLine() returns false once the end of the file is reached
            while (scanner.hasNextLine()) {
                line = scanner.nextLine();
                System.out.println(line);
            }
        }
    }
}

New I/O – Reading Bytes

Files.readAllBytes()

Even though the documentation for this method states that "it is not intended for reading in large files" I found this to be the absolute best performing file reading method, even on files as large as 1GB.

import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class ReadFile_Files_ReadAllBytes {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // reads the entire file into a byte array in one call
        byte[] fileBytes = Files.readAllBytes(file.toPath());
        char singleChar;
        for (byte b : fileBytes) {
            // casting a byte to a char is only meaningful for ASCII content
            singleChar = (char) b;
            System.out.print(singleChar);
        }
    }
}

3rd Party I/O – Reading Text

Commons – FileUtils.readLines()

Apache Commons IO is an open source Java library that comes with utility classes for reading and writing text and binary files. I listed it in this article because it can be used instead of the built in Java libraries. The class we're using is FileUtils.

For this article, version 2.6 was used, which is compatible with JDK 1.7+.

Note that you need to explicitly specify the encoding and that the method for using the default encoding has been deprecated.

import java.io.File;
import java.io.IOException;
import java.util.List;

import org.apache.commons.io.FileUtils;

public class ReadFile_Commons_FileUtils_ReadLines {
    public static void main(String[] pArgs) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // the encoding must be given explicitly
        List<String> fileLinesList = FileUtils.readLines(file, "UTF-8");

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

Guava – Files.readLines()

Google Guava is an open source library that comes with utility classes for common tasks like collections handling, cache management, IO operations and string processing.

I listed it in this article because it can be used instead of the built in Java libraries and I wanted to compare its performance with the Java built in libraries.

For this article, version 23.0 was used.

I'm not going to examine all the different ways to read files with Guava, since this article is not meant for that. For a more detailed look at all the different ways to read and write files with Guava, have a look at Baeldung's in depth article.

When reading a file, Guava requires that the character encoding be set explicitly, just like Apache Commons.

Compatibility note: This code was tested successfully on Java 8 and 9. I couldn't get it to work on Java 7 and kept getting an "Unsupported major.minor version 52.0" error. Class file version 52.0 corresponds to Java 8, which suggests the Guava jar I used was compiled for Java 8. Guava has a separate API doc for Java 7 which uses a slightly different version of the Files.readLines() method. I thought I could get it to work but I kept getting that error.

import java.io.File;
import java.io.IOException;
import java.util.List;

import com.google.common.base.Charsets;
import com.google.common.io.Files;

public class ReadFile_Guava_Files_ReadLines {
    public static void main(String[] args) throws IOException {
        String fileName = "c:\\temp\\sample-10KB.txt";
        File file = new File(fileName);

        // the encoding must be given explicitly
        List<String> fileLinesList = Files.readLines(file, Charsets.UTF_8);

        for (String line : fileLinesList) {
            System.out.println(line);
        }
    }
}

Performance Testing

Since there are so many ways to read from a file in Java, a natural question is "What file reading method is the best for my situation?" So I decided to test each of these methods against each other using sample data files of different sizes and timing the results.

Each code sample from this article displays the contents of the file to a string and then to the console (System.out). However, during the performance tests the System.out line was commented out since it would seriously slow down the performance of each method.

Each performance test measures the time it takes to read in the file (line by line, character by character, or byte by byte) without displaying anything to the console. I ran each test 5-10 times and took the average so as not to let any outliers influence each test. I also ran the default encoding version of each file reading method, i.e. I didn't specify the encoding explicitly.
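
The timing approach described above can be sketched roughly as follows. This is not the harness used for the published numbers: the class name, the method names, the run count and the use of System.nanoTime() are my assumptions here, and a rigorous benchmark would also need JVM warm-up, for example with a dedicated tool such as JMH.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical class, for illustration only
class ReadTimingSketch {

    // Read the whole file line by line without printing; returns the line count.
    static long readWithBufferedReader(String fileName) throws IOException {
        long lines = 0;
        try (BufferedReader reader = new BufferedReader(new FileReader(fileName))) {
            while (reader.readLine() != null) {
                lines++;
            }
        }
        return lines;
    }

    // Repeat the read and return the average elapsed time in milliseconds.
    static double averageMillis(String fileName, int runs) throws IOException {
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            readWithBufferedReader(fileName);
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1_000_000.0 / runs;
    }

    public static void main(String[] args) throws IOException {
        // Use a small temp file so the sketch runs anywhere
        Path temp = Files.createTempFile("timing-sketch", ".txt");
        Files.write(temp, Arrays.asList("one", "two", "three"), StandardCharsets.UTF_8);

        System.out.printf("average over 5 runs: %.3f ms%n", averageMillis(temp.toString(), 5));

        Files.delete(temp);
    }
}
```

To time a different reading method, swap the body of readWithBufferedReader for the method under test and point it at one of the sample files.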

Dev Setup

The dev environment used for these tests:

  • Intel Core i7-3615 QM @2.3 GHz, 8GB RAM
  • Windows 8 x64
  • Eclipse IDE for Java Developers, Oxygen.2 Release (4.7.2)
  • Java SE 9 (jdk-9.0.4)

Data Files

GitHub doesn't allow pushing files larger than 100 MB, so I couldn't find a practical way to store my large test files to allow others to replicate my tests. So instead of storing them, I'm providing the tools I used to generate them so you can create test files that are similar in size to mine. Obviously they won't be the same, but you'll generate files that are similar in size to the ones I used in my performance tests.

Random String Generator was used to generate sample text and then I simply copy-pasted to create larger versions of the file. When the file started getting too large to manage inside a text editor, I had to use the command line to merge multiple text files into a larger text file:

copy *.txt sample-1GB.txt
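
If you would rather generate the files from Java than by copy-pasting, something along these lines should produce a text file of roughly a target size. This is not the tool used for this article; the class name, the alphabet, the 80-character line length and the fixed seed are all assumptions made for the sketch.

```java
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Random;

// Hypothetical class, for illustration only
class SampleFileGeneratorSketch {

    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 ";
    private static final int LINE_LENGTH = 80; // arbitrary choice

    // Write random ASCII lines until at least targetBytes bytes have been written.
    static void generate(Path target, long targetBytes, long seed) throws IOException {
        Random random = new Random(seed);
        long written = 0;
        try (BufferedWriter writer = Files.newBufferedWriter(target, StandardCharsets.US_ASCII)) {
            while (written < targetBytes) {
                StringBuilder line = new StringBuilder(LINE_LENGTH + 1);
                for (int i = 0; i < LINE_LENGTH; i++) {
                    line.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
                }
                line.append('\n');
                writer.write(line.toString());
                written += line.length(); // ASCII: one byte per character
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // Generate roughly 10KB into a temp file and report the actual size
        Path temp = Files.createTempFile("sample-10KB", ".txt");
        generate(temp, 10 * 1024, 42L);
        System.out.println(temp + " -> " + Files.size(temp) + " bytes");
        Files.delete(temp);
    }
}
```

Because whole lines are written, the result overshoots the target by up to one line, which should not matter for timing purposes.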

I created the following 7 data file sizes to test each file reading method across a range of file sizes:

  • 1KB
  • 10KB
  • 100KB
  • 1MB
  • 10MB
  • 100MB
  • 1GB

Performance Summary

There were some surprises and some expected results from the performance tests.

As expected, the worst performers were the methods that read in a file character by character or byte by byte. But what surprised me was that the native Java IO libraries outperformed both 3rd party libraries, Apache Commons IO and Google Guava.

What's more, both Google Guava and Apache Commons IO threw a java.lang.OutOfMemoryError when trying to read in the 1GB test file. This also happened with the Files.readAllLines(Path) method but the remaining 7 methods were able to read in all test files, including the 1GB test file.
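
The memory behaviour is easier to see side by side. A method like readAllLines() has to hold every line in a List at once, while Files.lines() (Java 8+) hands out lines lazily, so the whole file never has to fit in memory. Here is a rough sketch, with a made-up class name and helper methods, that counts lines both ways:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Stream;

// Hypothetical class, for illustration only
class LineCountSketch {

    // Loads the entire file into a List first; memory use grows with file size.
    static long countEager(Path path) throws IOException {
        List<String> lines = Files.readAllLines(path, StandardCharsets.UTF_8);
        return lines.size();
    }

    // Streams the file lazily; lines are read as the stream is consumed.
    static long countLazy(Path path) throws IOException {
        try (Stream<String> lines = Files.lines(path, StandardCharsets.UTF_8)) {
            return lines.count();
        }
    }

    public static void main(String[] args) throws IOException {
        // Use a small temp file so the sketch runs anywhere
        Path temp = Files.createTempFile("line-count-sketch", ".txt");
        Files.write(temp, Arrays.asList("one", "two", "three"), StandardCharsets.UTF_8);

        System.out.println("eager: " + countEager(temp));
        System.out.println("lazy:  " + countLazy(temp));

        Files.delete(temp);
    }
}
```

On a small file both give the same answer. The difference only shows up once the file is large relative to the JVM heap.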

The following table summarizes the average time (in milliseconds) each file reading method took to complete. I highlighted the top three methods in green, the average performing methods in yellow and the worst performing methods in red:

The following chart summarizes the above table but with the following changes:

  • I removed java.io.FileInputStream.read() from the chart because its performance was so bad it would skew the entire chart and you wouldn't see the other lines properly
  • I summarized the data from 1KB to 1MB because after that, the chart would get too skewed with so many under performers and also some methods threw a java.lang.OutOfMemoryError at 1GB

The Winners

The new Java I/O libraries (java.nio) had the best overall winner (java.nio.file.Files.readAllBytes()) but it was followed closely by BufferedReader.readLine(), which was also a proven top performer across the board. The other excellent performer was java.nio.file.Files.lines(Path), which had slightly worse numbers for smaller test files but really excelled with the larger test files.

The absolute fastest file reader across all data tests was java.nio.file.Files.readAllBytes(Path). It was consistently the fastest and even reading a 1GB file only took about 1 second.

The following chart compares performance for a 100KB test file:

You can see that the lowest times were for Files.readAllBytes(), BufferedInputStream.read() and BufferedReader.readLine().

The following chart compares performance for reading a 10MB file. I didn't bother including the bar for FileInputStream.read() because the performance was so bad it would skew the entire chart and you couldn't tell how the other methods performed relative to each other:

Files.readAllBytes() really outperforms all other methods and BufferedReader.readLine() is a distant second.

The Losers

As expected, the absolute worst performer was java.io.FileInputStream.read(), which was orders of magnitude slower than its rivals for most tests. FileReader.read() was also a poor performer for the same reason: reading files byte by byte (or character by character) instead of with buffers drastically degrades performance.
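
One way to avoid the per-byte cost without switching classes is to ask FileInputStream for a whole chunk at a time through read(byte[]). This variant was not part of my performance tests, so I can't say where it would rank; the class name and the 8KB chunk size are assumptions made for the sketch.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical class, for illustration only
class ChunkedReadSketch {

    // Read the file in chunks and return the total number of bytes read.
    static long countBytesChunked(String fileName) throws IOException {
        byte[] chunk = new byte[8 * 1024]; // arbitrary chunk size
        long total = 0;
        try (FileInputStream in = new FileInputStream(fileName)) {
            int bytesRead;
            // read(byte[]) returns how many bytes were placed in the array,
            // or -1 at the end of the file
            while ((bytesRead = in.read(chunk)) != -1) {
                total += bytesRead;
            }
        }
        return total;
    }

    public static void main(String[] args) throws IOException {
        // Use a temp file of 20,000 zero bytes so the sketch runs anywhere
        Path temp = Files.createTempFile("chunk-sketch", ".bin");
        Files.write(temp, new byte[20_000]);

        System.out.println(countBytesChunked(temp.toString()));

        Files.delete(temp);
    }
}
```

Each call now moves up to 8KB instead of a single byte, which is essentially what BufferedInputStream does for you behind the scenes.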

Both the Apache Commons IO FileUtils.readLines() and Guava Files.readLines() crashed with an OutOfMemoryError when trying to read the 1GB test file and they were about average in performance for the remaining test files.

java.nio.file.Files.readAllLines() also crashed when trying to read the 1GB test file but it performed quite well for smaller file sizes.

Performance Rankings

Here's a ranked list of how well each file reading method did, in terms of speed and handling of large files, as well as compatibility with different Java versions.

Rank  File Reading Method
1     java.nio.file.Files.readAllBytes()
2     java.io.BufferedReader.readLine()
3     java.nio.file.Files.lines()
4     java.io.BufferedInputStream.read()
5     java.util.Scanner.nextLine()
6     java.nio.file.Files.readAllLines()
7     org.apache.commons.io.FileUtils.readLines()
8     com.google.common.io.Files.readLines()
9     java.io.FileReader.read()
10    java.io.FileInputStream.read()

Conclusion

I tried to present a comprehensive set of methods for reading files in Java, both text and binary. We looked at 15 different ways of reading files in Java and we ran performance tests to see which methods are the fastest.

The new Java IO library (java.nio) proved to be a great performer but so was the classic BufferedReader.