I'm not seriously pondering a FUSE filesystem that generates file contents on demand, without storing them, so that I could use it for torture testing tools that need to handle large amounts of data.

It's been years since I implemented a FUSE filesystem, and it wasn't a good experience then.

But it would be handy if someone wrote such a thing. Hint, hint.


@federicomena Cool. Now I just need someone to use that to write me what I want :)

@liw @federicomena Connect it to one of the deep-learning text-generation things. (Taking the filename you ask for as a prompt?)

@liw Done! Please try this proof-of-concept:

If you could write up a list of requirements, that'd be fantastic.

BTW I'd like to try Subplot sometime. Do you think this project would be a good fit? Looking at Obnam's Subplot scenarios, it should be fine, but perhaps there are some sharp edges that PlentyFS will inevitably bump into?


@minoru @federicomena This looks like an awesome start.

I think Subplot is a good match for verifying PlentyFS, and I've created a PR to add a rudimentary subplot for it. If you find it acceptable, I can work with you to capture all the requirements I have. Let me know what you think.

@liw Thanks for the pull request!

For requirements, I'd prefer to keep the discussion on GitHub so we have the record in a single place. I'm fine with one big "mind dump" issue, but one issue per point works too. I would then ask some clarifying questions, and then sum everything up in a PR to Sounds good?


@minoru @federicomena Sounds perfect. I'm a bit busy with other things today, but I created to start with.

@liw Is there any reason FIFO pipes don't work?

(I'm aware they often don't.)

@dredmorbius I'm thinking of tools like tar, which stat filesystem entries and don't read FIFOs.

Basically, I want to do something like this:

mount -o fakefs none /mnt

time tar -cf /dev/null /mnt

I can do that now with my billion-empty-files disk images, but I'd like to have non-empty files. I just don't have petabytes of disk space.
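A minimal sketch of the "generate contents on demand, without storing them" idea, assuming the bytes of a virtual file can be any deterministic function of its path and the read offset. The names `block` and `read` are hypothetical and are not taken from PlentyFS or any other project mentioned in this thread; a real FUSE filesystem would call something like `read` from its read handler.

```python
import hashlib

BLOCK = 64  # bytes per hash call (the largest digest BLAKE2b can produce)


def block(path: str, index: int) -> bytes:
    """Return block number `index` of the virtual file `path` (hypothetical helper)."""
    h = hashlib.blake2b(digest_size=BLOCK)
    h.update(path.encode("utf-8"))
    h.update(b"\0")  # separator between the path and the block counter
    h.update(index.to_bytes(8, "little"))
    return h.digest()


def read(path: str, offset: int, size: int) -> bytes:
    """Return `size` bytes of `path` starting at `offset`, computed on the fly."""
    if size <= 0:
        return b""
    first = offset // BLOCK
    last = (offset + size - 1) // BLOCK
    data = b"".join(block(path, i) for i in range(first, last + 1))
    start = offset - first * BLOCK
    return data[start:start + size]
```

Because every block depends only on the path and the block index, reads at any offset, in any order, return consistent data, and nothing is ever kept on disk or in memory between reads.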

@liw How real or semantic should the data be?

On any mountpoint with -dev permissions, you could mknod numerous files of arbitrary names with /dev/zero or /dev/urandom's device numbers. The kernel will feed you bits for a while.

The mountpoint itself could be a tmpfs, as /dev generally is these days.

You'll need root perms to mount the fs and create the files in the first place, but from there it's all unprivileged.

Your consuming tools will have to terminate processing as the files themselves will never end.

Might be able to do this with symlinks as well, though I'm not sure multiple processes reading the same device through different links works ... should for /dev/zero, not sure w/ /dev/full

@dredmorbius If tar will read the files as if they were real regular files, it'll probably be good enough for me.

Tar will not read device files. That approach is entirely not workable. stat(2) must tell tar it's a regular file and it must behave like one.

Sparse files all have zeroes, which isn't good enough for my purposes.

If you're willing to actually implement this, we can continue the discussion, but if you're just idly curious, I'm afraid I have other things that need my attention.
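To illustrate the stat(2) point above: a tool can tell regular files apart from FIFOs and device nodes just by looking at the mode bits, without opening anything. This is only a sketch of that check using Python's standard `os` and `stat` modules; `entry_kind` is a hypothetical helper, not how tar itself is implemented.

```python
import os
import stat


def entry_kind(path: str) -> str:
    """Classify a filesystem entry from its lstat() mode bits (hypothetical helper)."""
    mode = os.lstat(path).st_mode  # lstat: do not follow symlinks
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISSOCK(mode):
        return "socket"
    return "unknown"
```

On a typical Linux system, `entry_kind("/dev/zero")` should report a character device, which is why a node created with mknod does not pass for a regular file.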

@liw Understood.

I've enough sysadmin and scripting chops to maybe cobble something together if I understand the requirements.

As I understand, you want a single Very Large statable regular file (not device, block, character, link, FIFO, directory, socket). Seek forward but not back. Variable data content.

I'm going to chase the idea of httpfs2 + ... probably python simplehttpserver, serving over loopback, and see where that gets me.

The httpfs tools may be crufty.

httpfs provides the filesystem semantics, python the contents.

You might want to pursue this yourself. If the reqs are incorrect I'd appreciate a tip.

Otherwise, I'll avoid bothering you until I've got something that works or this proves a dead end.

@dredmorbius Not just one file, but a file system, actually.

@liw Thanks. A weakness of httpfs2 is that it doesn't support directory listings. 😞

There's a WebDAV FUSE module which might.

@liw Hrm ... could you loopback-mount /dev/urandom itself as a filesystem or device?

You can create arbitrarily large sparse files and loopback-mount those as filesystems, though creating structures on them will chew up real space.
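For reference, a small sketch of what a sparse file looks like from user space, assuming a filesystem that supports holes. `make_sparse` and `usage` are hypothetical helper names. The apparent size comes from `st_size`; the allocated space is estimated from `st_blocks`, which is counted in 512-byte units on Linux.

```python
import os


def make_sparse(path: str, size: int) -> None:
    """Create a file of `size` bytes without writing any data (hypothetical helper)."""
    with open(path, "wb") as f:
        f.truncate(size)  # extends the file; the new range reads back as zeros


def usage(path: str) -> tuple[int, int]:
    """Return (apparent size, allocated bytes) for `path`."""
    st = os.stat(path)
    # Assumption: st_blocks is in 512-byte units, as it is on Linux.
    return st.st_size, st.st_blocks * 512
```

Reading such a file returns only zero bytes, which is the limitation raised earlier in the thread: it gives a large file cheaply, but not varied content.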

