[PintOS] Project 4 : File System

chohk10·2023년 5월 30일

PintOS

목록 보기

8/8

Indexed and Extensible Files

In practice, this probably means using an index structure with direct, indirect, and doubly indirect blocks. In previous semesters, most students adopted something like what you have learned as Berkeley UNIX FFS with multi-level indexing.
However, to make your life easier, we make you implement it in an easier way: FAT. You must implement FAT with given skeleton code. Your code MUST NOT contain any code for multi-level indexing (FFS in the lecture). You will receive 0pts for file growth parts.

NOTE: You can assume that the file system partition will not be larger than 8 MB. You must support files as large as the partition (minus metadata). Each inode is stored in one disk sector, limiting the number of block pointers that it can contain.

Indexing large files with FAT (File Allocation Table)

clsuter : a file was stored as a contiguous single chunk over multiple disk sectors.

For simplicity, in our skeleton code, we fixed number of sectors in a cluster as 1.
need multiple clusters for a file, so we need a data structure to index the clusters for a file in the inode.
FAT (File Allocation Table) puts the connectivity of blocks in a fixed-size File Allocation Table rather than the blocks themselves. Since FAT only contains the connectivity value rather than the actual data, its size is small enough to be cached in DRAM. As a result, we can just read corresponding entries in the table.

First of all, 6 functions in fat.c (i.e. fat_init(), fat_open(), fat_close(), fat_create(), and fat_boot_create()) initialize and format disks at the boot time, so you don't have to modify them.

👉🏻 File system init process
init.c -> main() -> filesys_init(format_filesys)

format_filesys is defined by the option given to pintos through command line
disk_get(0, 1)
inode_init() -> initiate open_inodes list : reopen 시 같은 inode에 open_cnt만 증가하고 같은 inode를 return
fat_init()
do_format()
- fat_create()
- dir_create()
- fat_close() -> disk_write()
fat_open() -> fat table 이 memory 에 올라온다

cluster_t fat_fs_init (void);

Initialize FAT file system. You have to initialize fat_length and data_start field of fat_fs. fat_length stores how many clusters in the filesystem and data_start stores in which sector we can start to store files. You may want to exploit some values stored in fat_fs->bs. Also, you may want to initialize some other useful data in this function.

void
fat_fs_init (void) {
	fat_fs->data_start = fat_fs->bs.fat_start + fat_fs->bs.fat_sectors;
    fat_fs->fat_length = fat_fs->bs.fat_sectors * DISK_SECTOR_SIZE / sizeof(cluster_t);
    fat_fs->last_clst = 0;
    lock_init(&fat_fs->write_lock);
}

last_clst 멤버의경우 용도가 명확하게 정해져 있지는 않다. 다른 팀원은 disk의 마지막 클러스터 값으로 사용한다고 하고, 나는 마지막에 추가한 클러스터로 업데이트해주어, 다음 비어있는 클러스터를 좀더 빠르게 탐색할 수 있는 용도로 써보려고 한다. (아직 구현은 안함...;;)

cluster_t fat_create_chain (cluster_t clst);

Extend a chain by appending a cluster after the cluster specified in clst (cluster indexing number). If clst is equal to zero, then create a new chain. Return the cluster number of newly allocated cluster.

/* Add a cluster to the chain.
 * If CLST is 0, start a new chain.
 * Returns 0 if fails to allocate a new cluster. */
cluster_t
fat_create_chain (cluster_t clst) {
	cluster_t new_eoc;
	
	/* Search for an empty entry */
	for (new_eoc = 2; new_eoc < fat_fs->fat_length && fat_get(new_eoc) > 0; new_eoc++);
	
	/* Fail if all entries are full */
	if (new_eoc >= fat_fs->fat_length) {
		return 0;
	}

	/* Either end of a new chain or end of an existing chain */
	fat_put(new_eoc, EOChain);

	/* When requested for a new chain */
	if (clst == 0) {
		return new_eoc;
	}

	/* Add a new cluster to an existing chain */
	cluster_t old_eoc;
	for (old_eoc = clst; fat_get(old_eoc) != EOChain; old_eoc = fat_get(old_eoc));
	fat_put(old_eoc, new_eoc);

	return new_eoc;
}

void fat_remove_chain (cluster_t clst, cluster_t pclst);

Starting from clst, remove clusters from a chain. pclst should be the direct previous cluster in the chain. This means, after the execution of this function, pclst should be the last element of the updated chain. If clst is the first element in the chain, pclst should be 0.

/* Remove the chain of clusters starting from CLST.
 * If PCLST is 0, assume CLST as the start of the chain. */
void
fat_remove_chain (cluster_t clst, cluster_t pclst) {
	if (pclst) {
		fat_put(pclst, EOChain);
	}	

	for (; pclst != EOChain; clst = fat_get(clst)) {
		pclst = clst;
		fat_put(pclst, 0);
	}
}

void fat_put (cluster_t clst, cluster_t val);

Update FAT entry pointed by cluster number clst to val. Since each entry in FAT points the next cluster in a chain (if exist; otherwise EOChain), this could be used to update connectivity.

/* Update a value in the FAT table. */
void
fat_put (cluster_t clst, cluster_t val) {
	ASSERT(clst != 0);
	fat_fs->fat[clst] = val;
}

cluster_t fat_get (cluster_t clst);

Return in which cluster number the given cluster clst points.

/* Fetch a value in the FAT table. */
cluster_t
fat_get (cluster_t clst) {
	ASSERT(clst != 0);
	return fat_fs->fat[clst];
}

disk_sector_t cluster_to_sector (cluster_t clst);

Translates a cluster number clst into the corresponding sector number and return the sector number.

/* Covert a cluster # to a sector number. */
disk_sector_t
cluster_to_sector (cluster_t clst) {
	if (clst < 1)
		return 0;
	return fat_fs->data_start + (clst - 1) * fat_fs->bs.sectors_per_cluster;
}

추가로 sector_to_cluster 구현

cluster_t
sector_to_cluster (disk_sector_t sector) {
	if (sector <= fat_fs->data_start)
		return 0;

	return sector - fat_fs->data_start;
}

👉🏻 FAT 테이블의 경우 이제까지 구현했던 것 중에서 가장 심플하고 직관적인 구조여서 그런지, 함수를 구현하고 적용하는게 생각보다 재미있었다. 시간이 좀더 있었다면 더 완벽하게 구현했을 텐데 아쉬움이 남는 것 같다.

File Growth

Implement extensible files. In the basic file system, the file size is specified when the file is created. In most modern file systems, however, a file is initially created with size 0 and is then expanded every time a write is made off the end of the file. Your file system must allow this.

User programs are allowed to seek beyond the current end-of-file (EOF). The seek itself does not extend the file. Writing at a position past EOF extends the file to the position being written, and any gap between the previous EOF and the start of the write must be filled with zeros. A read starting from a position past EOF returns no bytes.

Writing far beyond EOF can cause many blocks to be entirely zero. Some file systems allocate and write real data blocks for these implicitly zeroed blocks. Other file systems do not allocate these blocks at all until they are explicitly written. The latter file systems are said to support "sparse files." You may adopt either allocation strategy in your file system

👉🏻 쓰기를 시작하기 전에 요청하는 offset과 size를 바탕으로 추가로 필요한 만큼을 한꺼번에 할당해준 후 쓰는 방법으로 구현해보았다. 실제로 쓰기 시작하기 전에 필요한 만큼의 디스크 할당이 가능한지 먼저 확인 하는 것이 좋을 것이라 생각했기 때문이다. 하지만 이 방법을 채택하는 경우, 클러스터 별로 하나씩 쓰기를 하는 도중에 실패가 발생한다면, 할당을 했지만 쓰지 않은 채로 남겨지는 클러스터가 여러개 발생하게 된다. 클러스터에 실제로 쓰려고 할 때마다 클러스터를 추가로 할당해주는 방법도 마찬가지긴 하지만 하나만 쓰여지지 않은 채 남겨지게 되니, 할당한 클러스터를 정리해주지 않는 경우 자원 누수를 줄일 수 있다.
그러나 두가지 방법 모두 쓰기를 실패한 클러스터부터 그 뒤에 연결된 클러스터를 할당 해제해주어야 더 확실하게 자원 누수를 방지할 수 있을 것이다. 그러려면 이전 sector_idx를 저장해두었다가 fat_remove_chain 함수를 호출해야할 것 같다.
클러스터 한개 정도의 누수를 감안하고, 매번 클러스터를 할당해주는 방식과, 한꺼번에 할당해주었다가 실패한 경우 한꺼번에 해제해주는 방식 중에 조금 더 따져보고서 선정하여 추후에 수정해보려고 한다.

bool file_growth (struct inode *inode, off_t pos) {
	cluster_t cluster = sector_to_cluster(inode->data.start);
	while (pos > 0) {
		if (fat_get(cluster) == EOChain) {
			if (!fat_create_chain(cluster)) {
				return false;
			}
			inode->data.length += DISK_SECTOR_SIZE;
		}
		pos -= DISK_SECTOR_SIZE;
		cluster = fat_get(cluster);
	}
	return true;
}

/* Writes SIZE bytes from BUFFER into INODE, starting at OFFSET.
 * Returns the number of bytes actually written, which may be
 * less than SIZE if end of file is reached or an error occurs.
 * (Normally a write at end of file would extend the inode, but
 * growth is not yet implemented.) */
off_t
inode_write_at (struct inode *inode, const void *buffer_, off_t size,
		off_t offset) {
	const uint8_t *buffer = buffer_;
	off_t bytes_written = 0;
	uint8_t *bounce = NULL;

	if (inode->deny_write_cnt)
		return 0;

	/* Extend file if needed */
	if (inode->data.length < offset + size) {
		if (!file_growth(inode, offset + size)) {
			return 0;
		}
	}

	while (size > 0) {
		/* Sector to write, starting byte offset within sector. */
		disk_sector_t sector_idx = byte_to_sector (inode, offset);
		int sector_ofs = offset % DISK_SECTOR_SIZE;

		/* Bytes left in inode, bytes left in sector, lesser of the two. */
		off_t inode_left = inode_length (inode) - offset;
		int sector_left = DISK_SECTOR_SIZE - sector_ofs;
		int min_left = inode_left < sector_left ? inode_left : sector_left;

		/* Number of bytes to actually write into this sector. */
		int chunk_size = size < min_left ? size : min_left;
		if (chunk_size <= 0)
   		// fat_remove_chain(sector_to_cluster(sector_idx), 이전 클러스터);
			break;

		if (sector_ofs == 0 && chunk_size == DISK_SECTOR_SIZE) {
			/* Write full sector directly to disk. */
			disk_write (filesys_disk, sector_idx, buffer + bytes_written); 
		} else {
			/* We need a bounce buffer. */
			if (bounce == NULL) {
				bounce = malloc (DISK_SECTOR_SIZE);
				if (bounce == NULL)
   				// fat_remove_chain(sector_to_cluster(sector_idx), 이전 클러스터);
					break;
			}

			/* If the sector contains data before or after the chunk
			   we're writing, then we need to read in the sector
			   first.  Otherwise we start with a sector of all zeros. */
			if (sector_ofs > 0 || chunk_size < sector_left) 
				disk_read (filesys_disk, sector_idx, bounce);
			else
				memset (bounce, 0, DISK_SECTOR_SIZE);
			memcpy (bounce + sector_ofs, buffer + bytes_written, chunk_size);
			disk_write (filesys_disk, sector_idx, bounce); 
		}

		/* Advance. */
		size -= chunk_size;
		offset += chunk_size;
		bytes_written += chunk_size;
	}
	free (bounce);

	return bytes_written;
}

Subdirectories

Implement a hierarchical name space

Make sure that directories can expand beyond their original size just as any other file can.
The basic file system has a 14-character limit on file names. You must allow full path names to be much longer than 14 characters.

Maintain a separate current directory for each process

Set the root as the initial process's current directory. => process_init
With the fork system call, the child process inherits its parent's current directory.

Update the existing system calls

bool chdir (const char *dir);

Changes the current working directory of the process to dir, which may be relative or absolute. Returns true if successful, false on failure.

bool mkdir (const char *dir);

Creates the directory named dir, which may be relative or absolute. Returns true if successful, false on failure. Fails if dir already exists or if any directory name in dir, besides the last, does not already exist. That is, mkdir("/a/b/c") succeeds only if /a/b already exists and /a/b/c does not.

bool readdir (int fd, char *name);

Reads a directory entry from file descriptor fd, which must represent a directory. If successful, stores the null-terminated file name in name, which must have room for READDIR_MAX_LEN + 1 bytes, and returns true. If no entries are left in the directory, returns false.

. and .. should not be returned by readdir. If the directory changes while it is open, then it is acceptable for some entries not to be read at all or to be read multiple times. Otherwise, each directory entry should be read once, in any order.

READDIR_MAX_LEN is defined in lib/user/syscall.h. If your file system supports longer file names than the basic file system, you should increase this value from the default of 14.

bool isdir (int fd);

Returns true if fd represents a directory, false if it represents an ordinary file.

int inumber (int fd);

Returns the inode number of the inode associated with fd, which may represent an ordinary file or a directory.

An inode number persistently identifies a file or directory. It is unique during the files existence. In Pintos, the sector number of the inode is suitable for use as an inode number.

Soft Link

Implement the soft link machanism in pintos. Soft link (aka. Symbolic link) is a pseudo file object that refers another file or directory. This file contains information of path of pointed file in terms of absolute or relative path. Let's assume the following situation:

/
├── a
│   ├── link1 -> /file
│   │
│   └── link2 -> ../file
└── file

The soft-link named link1 in /a points /file (as an absolute path), and link2 in /a points ../file (as a relative path). Reading the link1 or link2 file is equivalent to reading /file.

int symlink (const char *target, const char *linkpath);

Creates a symbolic link named linkpath which contains the string target. On success, zero is returned. Otherwise, -1 is returned.

chohk10

이전 포스트