第十三章------虚拟文件系统

虚拟文件系统是VFS，内核提供了文件和文件相关的接口。系统中所有文件不但依赖VFS共存，还依赖VFS协调工作。

通用的文件接口，包括read write和open这一类接口，我们可以使用这些接口进行文件、硬盘读取。

之所以能够使用统一的接口对文件进行读写，是因为linux提供了统一的抽象层

1.unix文件系统

unix使用了4种抽象概念：文件、目录、索引点

2.VFS对象以及数据结构

VFS 主要有四个对象，他们分别是

超级块对象

索引节点对象

目录对象

文件对象

目录是另一种形式的文件，VFS把目录当成文件来处理

四个对象分别为：

super_operation 对象，其中包括特定文件所能调用的方法

inode_operation对象，其中包括内核针对特定文件所能调用的方法,包括create和link

dentry_operation对象，其中包括特定目录能调用的方法，比如d_compare和d_delete

file_operation 对象，针对已经打开的文件所能调用的方法，read 或者write open等。

3.超级块对象

各种文件系统都必须实现超级块对象，该对象用于存储特定文件系统信息，通常对应与存放在特定磁盘待定扇区中的文件系统超级块或文件系统控制块。对于并非基于磁盘的文件系统，他们会在现场创建超级块并放在内存中

struct super_block {
        struct list_head        s_list;         /* Keep this first */
        dev_t                   s_dev;          /* search index; _not_ kdev_t */
        unsigned char           s_blocksize_bits;
        unsigned long           s_blocksize;
        loff_t                  s_maxbytes;     /* Max file size */
        struct file_system_type *s_type;
        const struct super_operations   *s_op;
        const struct dquot_operations   *dq_op;
        const struct quotactl_ops       *s_qcop;
        const struct export_operations *s_export_op;
        unsigned long           s_flags;
        unsigned long           s_iflags;       /* internal SB_I_* flags */
        unsigned long           s_magic;
        struct dentry           *s_root;
        struct rw_semaphore     s_umount;
        int                     s_count;
        atomic_t                s_active;
#ifdef CONFIG_SECURITY
        void                    *s_security;
#endif
        const struct xattr_handler **s_xattr;
#ifdef CONFIG_FS_ENCRYPTION
        const struct fscrypt_operations *s_cop;
        struct key              *s_master_keys; /* master crypto keys in use */
#endif
#ifdef CONFIG_FS_VERITY
        const struct fsverity_operations *s_vop;
#endif
#ifdef CONFIG_UNICODE
        struct unicode_map *s_encoding;
        __u16 s_encoding_flags;
#endif
        struct hlist_bl_head    s_roots;        /* alternate root dentries for NFS */
        struct list_head        s_mounts;       /* list of mounts; _not_ for fs use */
        struct block_device     *s_bdev;
 struct backing_dev_info *s_bdi;
        struct mtd_info         *s_mtd;
        struct hlist_node       s_instances;
        unsigned int            s_quota_types;  /* Bitmask of supported quota types */
        struct quota_info       s_dquot;        /* Diskquota specific options */

        struct sb_writers       s_writers;

        /*
         * Keep s_fs_info, s_time_gran, s_fsnotify_mask, and
         * s_fsnotify_marks together for cache efficiency. They are frequently
         * accessed and rarely modified.
         */
        void                    *s_fs_info;     /* Filesystem private info */

        /* Granularity of c/m/atime in ns (cannot be worse than a second) */
        u32                     s_time_gran;
        /* Time limits for c/m/atime in seconds */
        time64_t                   s_time_min;
        time64_t                   s_time_max;
#ifdef CONFIG_FSNOTIFY
        __u32                   s_fsnotify_mask;
        struct fsnotify_mark_connector __rcu    *s_fsnotify_marks;
#endif

        char                    s_id[32];       /* Informational name */
        uuid_t                  s_uuid;         /* UUID */

        unsigned int            s_max_links;
        fmode_t                 s_mode;

        /*
         * The next field is for VFS *only*. No filesystems have any business
         * even looking at it. You had been warned.
         */
        struct mutex s_vfs_rename_mutex;        /* Kludge */

        /*
         * Filesystem subtype.  If non-empty the filesystem type field
         * in /proc/mounts will be "type.subtype"
         */
/*
         * Filesystem subtype.  If non-empty the filesystem type field
         * in /proc/mounts will be "type.subtype"
         */
        const char *s_subtype;

        const struct dentry_operations *s_d_op; /* default d_op for dentries */

        /*
         * Saved pool identifier for cleancache (-1 means none)
         */
        int cleancache_poolid;

        struct shrinker s_shrink;       /* per-sb shrinker handle */

        /* Number of inodes with nlink == 0 but still referenced */
        atomic_long_t s_remove_count;

        /* Pending fsnotify inode refs */
        atomic_long_t s_fsnotify_inode_refs;

        /* Being remounted read-only */
        int s_readonly_remount;

        /* per-sb errseq_t for reporting writeback errors via syncfs */
        errseq_t s_wb_err;

        /* AIO completions deferred from interrupt context */
        struct workqueue_struct *s_dio_done_wq;
        struct hlist_head s_pins;

        /*
         * Owning user namespace and default context in which to
         * interpret filesystem uids, gids, quotas, device nodes,
         * xattrs and security labels.
         */
        struct user_namespace *s_user_ns;

        /*
         * The list_lru structure is essentially just a pointer to a table
         * of per-node lru lists, each of which has its own spinlock.
         * There is no need to put them into separate cachelines.
         */
        struct list_lru         s_dentry_lru;
        /*
         * The list_lru structure is essentially just a pointer to a table
         * of per-node lru lists, each of which has its own spinlock.
         * There is no need to put them into separate cachelines.
         */
        struct list_lru         s_dentry_lru;
        struct list_lru         s_inode_lru;
        struct rcu_head         rcu;
        struct work_struct      destroy_work;

        struct mutex            s_sync_lock;    /* sync serialisation lock */

        /*
         * Indicates how deep in a filesystem stack this SB is
         */
        int s_stack_depth;

        /* s_inode_list_lock protects s_inodes */
        spinlock_t              s_inode_list_lock ____cacheline_aligned_in_smp;
        struct list_head        s_inodes;       /* all inodes */

        spinlock_t              s_inode_wblist_lock;
        struct list_head        s_inodes_wb;    /* writeback inodes */
} __randomize_layout;

超级块操作：

如果一个系统要写自己的超级块，则需要调用：

        sb->s_op->write_super(sb);

看到这里我会思考，什么情况下会写超级块

其中操作方法 s_op 对每个文件系统来说，是非常重要的，它指向该超级块的操作函数表，包含一系列操作方法的实现，这些方法有：

分配inode
销毁inode
读、写inode
文件同步
等等

下面看一下超级块的用法

struct inode *alloc_inode(struct_block* sn);

在给定的超级块下创建和初始化一个新的索引节点对象

释放给定的索引点

void destroy_inode(struct inode *inode);

索引节点脏的时候会调用这个函数

void dirty_inode(struct inode* inode);

给定的索引点写入磁盘

void write_inode(struct inode* inode, int wait);

最后一个索引点被释放后会触发这个函数

drop_inode(struct inode* inode);

从磁盘上删除索引点

delete_inode(struct inode* inode)

给定的超级块更新磁盘上的超级块,调用者必须一致持有s_lock锁

void write_super(struct super_blokck* sb);

使文件系统的数据源和磁盘上的文件系统同步，wait参数指定是否要同步

void sync_fs(struct super_block* sb, int wait);

首先禁止对文件系统做改变，再使用给定的超级块更新磁盘上的超级块

void write_super_lockfs(struct super_block *sb);

对文件系统解除锁定，他是write_super_lockfs的逆操作

void unlokcfs(struct super_block *sb);

通过调用这个函数来获取文件状态。指定文件系统中相关的统计信息放到statfs中。

int statfs(struct super_block* sb, int *flags, char* data);

重新安装文件系统时候会调用这个函数

int remount_fs(struct super_block* sb, int *flags, char* data);

调用这个函数释放索引点

void clear_inode(struct inode* inode);

调用这个函数终端安装操作，这个函数一般被网络程序使用

umount_begin(struct super_block *sb);

以上所有函数都由VFS在进程上下文中调用

思索，我在平时工作中好像并没有接触到超级块，他是如何伴随着我们的日常的?linux超级块的全貌是什么样子的？

我认为其实它也是块设备的一种

首先要知道超级块是什么？

超级块是元数据的一部分，其中包含有关块设备上文件系统的信息。 Superblock提供以下有关文件，二进制文件，dll，元数据等的信息。

1)超级块，文件系统中第一个块被称为超级块。这个块存放文件系统本身的结构信息。比如，超级块记录了每个区域的大小，超级块也存放未被使用的磁盘块的信息。

Linux中存在着很多的文件系统，比如常见的ext2，ext3，ext4，sysfs，rootf，proc等等，一个超级块其实就对应一个独立文件系统

超级块有什么作用？

每个文件系统都有一个超级块结构，每个超级块都要链接到一个超级块链表。而文件系统内的每个文件在打开时都需要在内存分配一个inode结构，这些inode结构都要链接到超级块。

看完了我觉得linux超级块像windows的c盘，不过linux比较隐晦，不会暴漏给用户这些底层概念

4.索引节点对象

索引节点对象包含了内核在操作文件或者目录时候的全部信息，如果没有这些信息，那么不管这些信息在磁盘上怎么存放的，必须从磁盘读取所有信息-------也就是说这些信息记录在inode上，可以加快读取，直接去内存读，而不用去磁盘读

i_bdev指向的是块设备结构体

c_dev 指向的就是字符串设备结构体

索引点操作对象

操作函数如下：

VFS通过系统调用create、open来调用这个函数，从而为dentry对象创建一个索引节点。创建的时候使用mode的初始模式

int create(struct inode* dir, struct dentry* dentry, int mode)

在特定目录寻找索引节点，这个索引几点对应dentry中的文件名

struct dentry * lookup(struct inode* dir, struct dentry * dentry)

还有很多我们常见的

link、unlink、symlink、mkdir、rmdir、readlink 就不多说了

mknod是一个有意思的命令

mknod 命令建立一个目录项和一个特殊文件的对应索引节点。

mknod 还可以创建设备文件

我看到这里有点乱，索引节点对象到底是做什么的是跟目录有关还是文件？

根据百度百科，提取一些关键介绍

linux文件系统将文件索引节点号和文件名同时保存在目录中。

所以，目录只是将文件的名称和它的索引节点号结合在一起的一张表，目录中每一对文件名称和索引节点号称为一个连接。对于一个文件来说有唯一的索引节点号与之对应，对于一个索引节点号，却可以有多个文件名与之对应。因此，在磁盘上的同一个文件可以通过不同的路径去访问它。

我们先看第一个data 目录

zhanglei@ubuntu:~$ stat data
  File: data
  Size: 4096      	Blocks: 8          IO Block: 4096   directory
Device: 805h/2053d	Inode: 11536259    Links: 32
Access: (0775/drwxrwxr-x)  Uid: ( 1000/zhanglei)   Gid: ( 1000/zhanglei)
Access: 2021-12-30 16:40:23.940211255 +0800
Modify: 2021-12-29 10:27:41.774194959 +0800
Change: 2021-12-29 10:27:41.774194959 +0800
 Birth: -

再看data目录下的index.html和test.log

zhanglei@ubuntu:~/data$ stat index.html 
  File: index.html
  Size: 14848     	Blocks: 32         IO Block: 4096   regular file
Device: 805h/2053d	Inode: 11535969    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/zhanglei)   Gid: ( 1000/zhanglei)
Access: 2021-12-27 16:41:13.534683164 +0800
Modify: 2021-11-04 11:46:49.090311376 +0800
Change: 2021-11-04 11:46:49.090311376 +0800
 Birth: -


zhanglei@ubuntu:~/data$ stat test.log 
  File: test.log
  Size: 2124      	Blocks: 8          IO Block: 4096   regular file
Device: 805h/2053d	Inode: 11535967    Links: 1
Access: (0664/-rw-rw-r--)  Uid: ( 1000/zhanglei)   Gid: ( 1000/zhanglei)
Access: 2021-11-02 19:30:36.301627388 +0800
Modify: 2021-11-02 19:30:36.209629723 +0800
Change: 2021-11-02 19:30:36.209629723 +0800
 Birth: -

这就很好解释了，所以，目录只是将文件的名称和它的索引节点号结合在一起的一张表，每一个文件都有自己的node id号