Storing data as plain text vs. storing data as binary

Posted on 2022-11-29

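With ASCII encoding and decoding, every character is stored on its own, so the number below takes 16 bytes, one per character. A byte has 8 bits, yet this string only ever uses the ten digit values (0 to 9) plus the decimal point, while a single byte can hold $2^8=256$ distinct values. Rather than waste that much space, there is an alternative way to store the number.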

number = 1.34567890123456
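
The size difference described above can be checked directly. This is a minimal sketch, assuming Python 3 and the standard `struct` module; the names `text_bytes` and `binary_bytes` are illustrative and do not appear in the original post.

```python
import struct

number = 1.34567890123456

# Plain text: one ASCII byte per character of the decimal representation.
text_bytes = str(number).encode("ascii")

# Binary: a C double (IEEE 754 binary64 on common platforms), 8 bytes.
binary_bytes = struct.pack("d", number)

print(len(text_bytes))    # bytes needed when stored as text
print(len(binary_bytes))  # bytes needed when stored as binary
```

The text form grows with the number of digits written, whereas the binary form stays at 8 bytes for any double.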

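Storing data as plain text

Storing data as plain text is intuitive: the output file can be opened and read directly, which makes this approach easy to adopt.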

# Write the decimal representation of the number as text
with open('data.txt', 'w') as file:
    file.write(str(number))

File size: 16 bytes.

Opening the file in vim: vim data.txt

1.34567890123456
~                                                                                                     
~

Reading the file

with open('data.txt', 'r') as file:
    contents = file.read()
# contents == '1.34567890123456' (a str; use float(contents) to get the number back)
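
Note that reading the text file yields a string, not a number. Below is a minimal sketch of the full text round trip, assuming Python 3; the temporary directory and the names `path`, `size`, and `restored` are illustrative choices, not part of the original post.

```python
import os
import tempfile

number = 1.34567890123456

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.txt")

    # Write the decimal representation as text.
    with open(path, "w") as f:
        f.write(str(number))

    size = os.path.getsize(path)  # bytes on disk

    # Read it back; the result is a str and has to be parsed.
    with open(path, "r") as f:
        contents = f.read()

restored = float(contents)
print(size, repr(contents), restored == number)
```

The parsing step is the price of the readable format: every read has to convert the characters back into a float.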

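Storing data as binary

Binary storage maps the value onto a standard data format (here an 8-byte double-precision float), which saves space. The drawback is that the file can no longer be read directly.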

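import struct

# "d" packs the value as an 8-byte C double in native byte order;
# struct.pack already returns bytes, so no bytearray wrapper is needed
with open('data.bin', 'wb') as file:
    file.write(struct.pack("d", number))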

File size: 8 bytes.

Opening the file in vim: vim data.bin

¯<83>{<99>æ<87>õ?
~                                                                                                     
~
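
Because the raw bytes are not printable, a hex dump is usually easier to inspect than the vim view. This is a minimal sketch, assuming Python 3.5 or later for `bytes.hex()` and a little-endian layout forced with the `<` prefix (the code in this post uses the machine's native byte order instead); `raw` is an illustrative name.

```python
import struct

number = 1.34567890123456

# "<d" = little-endian IEEE 754 double, independent of the host machine.
raw = struct.pack("<d", number)

print(raw.hex())  # af837b99e687f53f
```

These bytes line up with what vim displays: `¯` is 0xAF, `{` is 0x7B, `æ` is 0xE6, `õ` is 0xF5, and `?` is 0x3F, while `<83>`, `<99>`, and `<87>` are bytes vim cannot print.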

Reading the file

with open('data.bin', 'rb') as file:
    # struct.unpack returns a tuple; the float is its first element
    contents = struct.unpack("d", file.read())[0]
# contents == 1.34567890123456 (a float)
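
The binary round trip can also be wrapped in two small helpers. This is only a sketch, assuming Python 3: `write_double` and `read_double` are hypothetical names introduced here, and the explicit `<d` format is a swapped-in choice that fixes the byte order so the file reads the same on any machine.

```python
import os
import struct
import tempfile

DOUBLE = struct.Struct("<d")  # little-endian 8-byte IEEE 754 double

def write_double(path, value):
    """Hypothetical helper: store one float as 8 raw bytes."""
    with open(path, "wb") as f:
        f.write(DOUBLE.pack(value))

def read_double(path):
    """Hypothetical helper: read back a float written by write_double."""
    with open(path, "rb") as f:
        return DOUBLE.unpack(f.read(DOUBLE.size))[0]

number = 1.34567890123456

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "data.bin")
    write_double(path, number)
    size = os.path.getsize(path)
    restored = read_double(path)

print(size, restored == number)
```

Unlike the text version, no parsing is involved: the 8 bytes on disk are the float's in-memory representation, so the value comes back bit for bit.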
