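File system I/O
File system I/O and memory I/O
Java plain I/O and buffered I/O
Plain I/O
Run the script in the test directory:
./mysh 0
(0 selects the most basic file-write logic.) At the same time, open another shell window and use ll -h to watch
how fast the generated out.txt grows. As the listing below shows, it grows visibly slowly (KB scale).
Open the trace files generated by strace; the largest one belongs to the main thread.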
-rw-r--r-- 1 root root 4.1K Jun 27 12:12 OSFileIO.class
-rw-r--r-- 1 root root 4.4K Jun 27 11:37 OSFileIO.java
-rwxr-xr-x 1 root root 123 Jun 27 11:11 mysh*
-rw-r--r-- 1 root root 14K Jun 27 12:12 out.7754
-rw-r--r-- 1 root root 4.4M Jun 27 12:15 out.7755
-rw-r--r-- 1 root root 1.2K Jun 27 12:12 out.7756
-rw-r--r-- 1 root root 1.3K Jun 27 12:12 out.7757
-rw-r--r-- 1 root root 1.1K Jun 27 12:12 out.7758
-rw-r--r-- 1 root root 1.4K Jun 27 12:12 out.7759
-rw-r--r-- 1 root root 506K Jun 27 12:15 out.7760
-rw-r--r-- 1 root root 41K Jun 27 12:15 out.7761
-rw-r--r-- 1 root root 1.2K Jun 27 12:12 out.7762
-rw-r--r-- 1 root root 1.4K Jun 27 12:12 out.7763
-rw-r--r-- 1 root root 1.3K Jun 27 12:12 out.7764
-rw-r--r-- 1 root root 1.2K Jun 27 12:12 out.7765
-rw-r--r-- 1 root root 41K Jun 27 12:15 out.7766
-rw-r--r-- 1 root root 12K Jun 27 12:15 out.7767
-rw-r--r-- 1 root root 13K Jun 27 12:15 out.7768
-rw-r--r-- 1 root root 1.2K Jun 27 12:12 out.7769
-rw-r--r-- 1 root root 1.2K Jun 27 12:12 out.7770
-rw-r--r-- 1 root root 794K Jun 27 12:15 out.7771
-rw-r--r-- 1 root root 1.9K Jun 27 12:15 out.7772
-rw-r--r-- 1 root root 183K Jun 27 12:15 out.txt
# The main thread's trace file is the largest; here it is out.7755
vim out.7755
Use :set nu to show line numbers. Each write system call writes only 10 bytes of data:
1307 futex(0x7f0980023928, FUTEX_WAKE_PRIVATE, 1) = 0
1308 write(4, "123456789\n", 10) = 10
1309 futex(0x7f0980023978, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=12089, tv_nsec=691940400}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
1310 futex(0x7f0980023928, FUTEX_WAKE_PRIVATE, 1) = 0
1311 write(4, "123456789\n", 10) = 10
1312 futex(0x7f0980023978, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=12089, tv_nsec=702383900}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
1313 futex(0x7f0980023928, FUTEX_WAKE_PRIVATE, 1) = 0
1314 write(4, "123456789\n", 10) = 10
1315 futex(0x7f0980023978, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=12089, tv_nsec=712889000}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
1316 futex(0x7f0980023928, FUTEX_WAKE_PRIVATE, 1) = 0
1317 write(4, "123456789\n", 10) = 10
1318 futex(0x7f0980023978, FUTEX_WAIT_BITSET_PRIVATE, 0, {tv_sec=12089, tv_nsec=723286200}, FUTEX_BITSET_MATCH_ANY) = -1 ETIMEDOUT (Connection timed out)
1319 futex(0x7f0980023928, FUTEX_WAKE_PRIVATE, 1) = 0
1320 write(4, "123456789\n", 10) = 10
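Buffered I/O
Run the script in the test directory:
./mysh 1
(1 selects the buffered I/O logic.) At the same time, open another shell window and watch with ll -h.
The generated out.txt now grows much faster (MB scale), and each write system call writes 8190 bytes.
strace output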
7420 futex(0x7f4d80023928, FUTEX_WAKE_PRIVATE, 1) = 0
7421 write(4, "123456789\n123456789\n123456789\n12"..., 8190) = 8190
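Summary
Buffered I/O stores the written content in an array and, once it reaches a certain capacity, writes the whole batch with a single write system call, whereas plain I/O issues a system call for every write. Each system call requires a switch from user mode to kernel mode, which is expensive, so the read/write speed of the two differs by orders of magnitude.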
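The difference can be reproduced with a minimal sketch. The source's OSFileIO.java is not shown, so the class name, method names, file names, and record count below are hypothetical and this is not the author's code. It writes the same 10-byte record either directly through a FileOutputStream or through a BufferedOutputStream. With the default 8192-byte buffer, 819 whole records fit before a flush, which would explain the 8190-byte writes in the trace above.

```java
import java.io.BufferedOutputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class PlainVsBufferedWrite {
    // The same 10-byte record that appears in the strace output.
    static final byte[] RECORD = "123456789\n".getBytes(StandardCharsets.US_ASCII);

    // Plain I/O: every write() on FileOutputStream is handed to the OS directly.
    static long writePlain(String path, int records) {
        try (OutputStream out = new FileOutputStream(path)) {
            return writeRecords(out, records);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Buffered I/O: records accumulate in an in-memory array and are written in batches.
    static long writeBuffered(String path, int records) {
        try (OutputStream out = new BufferedOutputStream(new FileOutputStream(path))) {
            return writeRecords(out, records);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Writes the record the given number of times and returns the elapsed nanoseconds.
    private static long writeRecords(OutputStream out, int records) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < records; i++) {
            out.write(RECORD);
        }
        out.flush();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        int records = 200_000; // arbitrary count chosen for illustration
        long plain = writePlain("plain.txt", records);
        long buffered = writeBuffered("buffered.txt", records);
        System.out.printf("plain: %d ms, buffered: %d ms%n",
                plain / 1_000_000, buffered / 1_000_000);
    }
}
```

Running it under strace -ff, as in the experiment above, should show many small writes for the plain variant and far fewer, larger writes for the buffered one. The exact timings depend on the machine.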
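ByteBuffer in the java.nio package

API usage
Main fields
// Marked position; -1 means no mark is set
private int mark = -1;
// Current position of the cursor
private int position = 0;
// Limit: the bound for reads and writes (set to the old position by flip)
private int limit;
// Maximum capacity
private int capacity;
// Memory address, used when the buffer is off-heap
long address;
Main methods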
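// Returns the maximum capacity of this buffer
public final int capacity() {return capacity;}
// Returns the current position
public final int position() {return position;}
// Returns the current read/write limit
public final int limit() {return limit;}
// Marks the current position
public final Buffer mark() {
    mark = position;
    return this;
}
// Restores the position to the previously marked position
public final Buffer reset() {
    int m = mark;
    if (m < 0)
        throw new InvalidMarkException();
    position = m;
    return this;
}
// Clears the buffer. The data is not erased; only the cursors are reset,
// so later writes simply overwrite the old content
public final Buffer clear() {
    position = 0;
    limit = capacity;
    mark = -1;
    return this;
}
// Switches from writing to reading: limit becomes the old position, position becomes 0
public final Buffer flip() {
    limit = position;
    position = 0;
    mark = -1;
    return this;
}
// Starts reading/writing from the beginning again: resets position and mark, keeps limit
public final Buffer rewind() {
    position = 0;
    mark = -1;
    return this;
}
// Number of elements still readable/writable
public final int remaining() {return limit - position;}
// Whether anything is still readable/writable
public final boolean hasRemaining() {return position < limit;}
// Whether the buffer is read-only
public abstract boolean isReadOnly();
// Whether the buffer is backed by an accessible array
public abstract boolean hasArray();
// Returns the backing array (when hasArray returns true)
public abstract Object array();
// Whether this is an off-heap buffer, i.e. a direct buffer
public abstract boolean isDirect();
// Discards the mark
final void discardMark() {mark = -1;}
// Compacts the buffer: copies the unread bytes to the start of the buffer
// and sets position to the index just after the last copied byte
public abstract ByteBuffer compact();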
Test case
import java.nio.ByteBuffer;

@Test
public void whatByteBuffer(){
    // ByteBuffer buffer = ByteBuffer.allocate(1024); // on-heap memory
    ByteBuffer buffer = ByteBuffer.allocateDirect(1024); // off-heap memory, allocated through the Unsafe and VM classes via native calls
    System.out.println("position: " + buffer.position());
    System.out.println("limit: " + buffer.limit());
    System.out.println("capacity: " + buffer.capacity());
    System.out.println("mark: " + buffer);
    buffer.put("123".getBytes()); // what is actually stored are the ASCII values of '1', '2', '3'
    System.out.println("-------------put:123......");
    System.out.println("mark: " + buffer);
    buffer.flip(); // switch from writing to reading
    System.out.println("-------------flip......");
    System.out.println("mark: " + buffer);
    buffer.get();
    System.out.println("-------------get......");
    System.out.println("mark: " + buffer);
    buffer.compact();
    System.out.println("-------------compact......");
    System.out.println("mark: " + buffer);
    buffer.clear();
    System.out.println("-------------clear......");
    System.out.println("mark: " + buffer);
}
// Output:
position: 0
limit: 1024
capacity: 1024
mark: java.nio.DirectByteBuffer[pos=0 lim=1024 cap=1024]
-------------put:123......
mark: java.nio.DirectByteBuffer[pos=3 lim=1024 cap=1024]
-------------flip......
mark: java.nio.DirectByteBuffer[pos=0 lim=3 cap=1024]
-------------get......
mark: java.nio.DirectByteBuffer[pos=1 lim=3 cap=1024]
-------------compact......
mark: java.nio.DirectByteBuffer[pos=2 lim=1024 cap=1024]
-------------clear......
mark: java.nio.DirectByteBuffer[pos=0 lim=1024 cap=1024]
P.S. put("123".getBytes()) actually stores the corresponding ASCII codes.

Walkthrough of the example
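The same sequence of operations can be traced on a small heap buffer, which makes the cursor movements easier to follow. This is only an illustrative sketch: the class name, the helper state, and the 8-byte capacity are hypothetical choices, not part of the original test.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class ByteBufferWalkthrough {
    // Formats the three cursor values of a buffer.
    static String state(ByteBuffer b) {
        return "pos=" + b.position() + " lim=" + b.limit() + " cap=" + b.capacity();
    }

    // Runs put -> flip -> get -> compact -> clear and records the state after each step.
    static String[] run() {
        ByteBuffer buf = ByteBuffer.allocate(8);
        String[] states = new String[5];

        buf.put("123".getBytes(StandardCharsets.US_ASCII));
        states[0] = state(buf); // three bytes written, position advanced to 3

        buf.flip();
        states[1] = state(buf); // limit = old position, position back to 0

        buf.get();
        states[2] = state(buf); // one byte ('1') consumed

        buf.compact();
        states[3] = state(buf); // unread '2','3' moved to index 0 and 1, ready for more writes

        buf.clear();
        states[4] = state(buf); // cursors reset, data left in place
        return states;
    }

    public static void main(String[] args) {
        for (String s : run()) {
            System.out.println(s);
        }
    }
}
```

The states follow the same pattern as the output of the test case, with a capacity of 8 instead of 1024.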






DirectByteBuffer
ByteBuffer buffer = ByteBuffer.allocateDirect(1024)
//
public static ByteBuffer allocateDirect(int capacity) {
return new DirectByteBuffer(capacity);
}
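Off-heap memory is allocated mainly through the Unsafe class.
Off-heap memory lies outside the memory region managed by the JVM. In Java, operations on off-heap memory rely on the native methods that Unsafe provides for this purpose.
Why use off-heap memory
- Shorter garbage collection pauses. Off-heap memory is managed directly by the operating system rather than by the JVM, so using it lets the heap stay relatively small, which reduces the impact of GC pauses on the application.
- Better I/O performance. I/O usually involves copying data from on-heap memory to off-heap memory, so short-lived staging data that is copied between memory regions frequently is best kept off-heap.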
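From the caller's side the two kinds of buffer share the same API; only the allocation call and a few properties differ. A minimal sketch, with a hypothetical class and helper name:

```java
import java.nio.ByteBuffer;

class DirectVsHeap {
    // Returns {isDirect, hasArray} for the given buffer.
    static boolean[] describe(ByteBuffer b) {
        return new boolean[] { b.isDirect(), b.hasArray() };
    }

    public static void main(String[] args) {
        ByteBuffer heap = ByteBuffer.allocate(1024);         // backed by a byte[] on the Java heap
        ByteBuffer direct = ByteBuffer.allocateDirect(1024); // backed by memory outside the heap

        boolean[] h = describe(heap);
        boolean[] d = describe(direct);
        System.out.println("heap:   isDirect=" + h[0] + " hasArray=" + h[1]);
        System.out.println("direct: isDirect=" + d[0] + " hasArray=" + d[1]);
    }
}
```

A direct buffer has no accessible backing array, so code that calls array() should check hasArray() first.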
// Primary constructor
//
DirectByteBuffer(int cap) { // package-private
super(-1, 0, cap, cap);
boolean pa = VM.isDirectMemoryPageAligned();
int ps = Bits.pageSize();
long size = Math.max(1L, (long)cap + (pa ? ps : 0));
Bits.reserveMemory(size, cap);
long base = 0;
try {
base = unsafe.allocateMemory(size);
} catch (OutOfMemoryError x) {
Bits.unreserveMemory(size, cap);
throw x;
}
unsafe.setMemory(base, size, (byte) 0);
if (pa && (base % ps != 0)) {
// Round up to page boundary
address = base + ps - (base & (ps - 1));
} else {
address = base;
}
cleaner = Cleaner.create(this, new Deallocator(base, size, cap));
att = null;
}
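Cleaner extends PhantomReference, one of Java's four reference types. (The referent cannot be obtained through a phantom reference, and an object that is only phantom-reachable may be reclaimed at any GC.) PhantomReference is usually combined with a ReferenceQueue so that the program can be notified and release resources when the referent is garbage collected. When an object referenced by a Cleaner is about to be reclaimed, the garbage collector places the reference on the pending list in Reference, where it waits to be processed by Reference-Handler. Reference-Handler is a daemon thread with the highest priority; it loops over the pending list and calls the Cleaner's clean method to perform the cleanup.
So when a DirectByteBuffer is referenced only by its Cleaner (that is, only through a phantom reference), it can be reclaimed at any GC. When the DirectByteBuffer instance is reclaimed, the Reference-Handler thread calls the Cleaner's clean method, which frees the off-heap memory using the Deallocator that was passed in when the Cleaner was created.
RandomAccessFile random read/write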
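RandomAccessFile can both read file content and write data to a file. It also supports "random access": a program can jump directly to any position in the file and read or write there.
RandomAccessFile lets the program set the file pointer freely, so output does not have to start at the beginning of the file and content can be appended to an existing file. If a program needs to append to an existing file, RandomAccessFile is a suitable choice.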
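Both uses can be sketched in a few lines. The class and method names below are hypothetical. Appending seeks to the current end of the file first, while overwriting seeks to an arbitrary offset.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class RandomAccessFileSketch {
    // Appends text by moving the file pointer to the end before writing.
    static void append(String path, String text) {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            raf.seek(raf.length());
            raf.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Overwrites bytes in place, starting at the given offset from the beginning of the file.
    static void overwriteAt(String path, long offset, String text) {
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw")) {
            raf.seek(offset);
            raf.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the whole file back as a string.
    static String readAll(String path) {
        try {
            return new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String path = "raf-demo.txt"; // hypothetical file name
        new java.io.File(path).delete();
        append(path, "hello world\n");
        append(path, "hello java\n");
        overwriteAt(path, 4, "ooxx");
        System.out.print(readAll(path));
    }
}
```

Writing at an offset replaces the existing bytes and does not insert, so the file length stays the same unless the write runs past the end.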
Common methods
/**
* Returns the unique {@link java.nio.channels.FileChannel FileChannel}
* object associated with this file.
*
* <p> The {@link java.nio.channels.FileChannel#position()
* position} of the returned channel will always be equal to
* this object's file-pointer offset as returned by the {@link
* #getFilePointer getFilePointer} method. Changing this object's
* file-pointer offset, whether explicitly or by reading or writing bytes,
* will change the position of the channel, and vice versa. Changing the
* file's length via this object will change the length seen via the file
* channel, and vice versa.
*
* @return the file channel associated with this file
*
* @since 1.4
* @spec JSR-51
*/
public final FileChannel getChannel() {
synchronized (this) {
if (channel == null) {
channel = FileChannelImpl.open(fd, path, true, rw, this);
}
return channel;
}
}
/**
* Sets the file-pointer offset, measured from the beginning of this
* file, at which the next read or write occurs. The offset may be
* set beyond the end of the file. Setting the offset beyond the end
* of the file does not change the file length. The file length will
* change only by writing after the offset has been set beyond the end
* of the file.
*
* @param pos the offset position, measured in bytes from the
* beginning of the file, at which to set the file
* pointer.
* @exception IOException if {@code pos} is less than
* {@code 0} or if an I/O error occurs.
*/
public void seek(long pos) throws IOException {
if (pos < 0) {
throw new IOException("Negative seek offset");
} else {
seek0(pos);
}
}
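Example
// Test file NIO
public static void testRandomAccessFileWrite() throws Exception {
    RandomAccessFile raf = new RandomAccessFile(path, "rw");
    raf.write("hello world\n".getBytes());
    raf.write("hello java\n".getBytes());
    System.out.println("write------------");
    System.in.read();
    // Seek to offset 4 from the start of the file and write there
    raf.seek(4);
    raf.write("ooxx".getBytes());
    System.out.println("seek---------");
    System.in.read();
    FileChannel rafchannel = raf.getChannel();
    // mmap: off-heap memory mapped to the file; it holds raw bytes, not objects
    MappedByteBuffer map = rafchannel.map(FileChannel.MapMode.READ_WRITE, 0, 4096);
    map.put("@@@".getBytes()); // not a system call, yet the data reaches the kernel page cache
    // Previously a system call such as out.write() was needed to get the program's data into the kernel page cache,
    // which always meant a switch between user mode and kernel mode
    System.out.println("map--put--------");
    // (the rest of the method is not shown in the source)
}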
Run the script
The first read blocks; at this point the content has already been written to the page cache.
[email protected]:~/develop/test# ./mysh* 2
write------------
[email protected]:~/develop/test# cat out.txt && pcstat out.txt
hello world
hello java
+---------+----------------+------------+-----------+---------+
| Name | Size (bytes) | Pages | Cached | Percent |
|---------+----------------+------------+-----------+---------|
| out.txt | 31 | 1 | 1 | 100.000 |
+---------+----------------+------------+-----------+---------+
Type any line to release the blocked read:
[email protected]:~/develop/test# ./mysh* 2
write------------
啊
seek---------
map--put--------
java.nio.HeapByteBuffer[pos=4096 lim=8192 cap=8192]
java.nio.HeapByteBuffer[pos=0 lim=4096 cap=8192]
@@@looxxrld
hello java
[email protected]:~/develop/test# cat out.txt && pcstat out.txt
@@@looxxshibing
hello java
+---------+----------------+------------+-----------+---------+
| Name | Size (bytes) | Pages | Cached | Percent |
|---------+----------------+------------+-----------+---------|
| out.txt | 4096 | 1 | 1 | 100.000 |
+---------+----------------+------------+-----------+---------+
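mmap memory mapping
The example above used FileChannel.map to create a direct memory mapping. The mmap system call creates a mapping that is listed with the FD type mem (as tools such as lsof display it). From then on the file can be modified directly through the mapped buffer without going through read/write system calls: the stores go through the mmap mapping straight to the corresponding page cache pages.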
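A self-contained version of the mapping step might look like the sketch below. The class name, method name, and file name are hypothetical. force() is called only to make the check deterministic; it asks the OS to write the mapped pages back to the storage device and is not required for the data to reach the page cache.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class MmapWriteSketch {
    // Maps the first mapSize bytes of the file, stores text through the mapping,
    // then reads the file back through the normal file API and returns what it sees.
    static String writeThroughMapping(String path, String text, int mapSize) {
        byte[] bytes = text.getBytes(StandardCharsets.US_ASCII);
        try (RandomAccessFile raf = new RandomAccessFile(path, "rw");
             FileChannel channel = raf.getChannel()) {
            MappedByteBuffer map = channel.map(FileChannel.MapMode.READ_WRITE, 0, mapSize);
            map.put(bytes); // a plain memory store, no write() system call
            map.force();    // ask the OS to flush the mapped pages to the device
            byte[] onDisk = Files.readAllBytes(Paths.get(path));
            return new String(onDisk, 0, bytes.length, StandardCharsets.US_ASCII);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String path = "mmap-demo.txt"; // hypothetical file name
        System.out.println(writeThroughMapping(path, "@@@", 4096));
    }
}
```

In the run shown earlier, the file size reported by pcstat became 4096 bytes after the mapping, which matches the size passed to map.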
