File handling
PERSEUS can save files in the database. Each file has to be linked to an object within the database. This lets you, for example, add metadata to a file or create a logical link to an object.
Additionally, every file can hold tags. You can use them to categorize files and to determine their intended purpose later, for example when you cannot rely on a consistent filename.
Add a new file
To add a file to the database, use the method DatabaseManager().add_associated_file() and pass
- the related object (this object has to already exist in the database),
- the file as BinaryIO,
- the filename / identifier (optional),
- a list of tags (optional).
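The file argument has to be a binary stream. As a minimal sketch that uses only the Python standard library, the following shows which objects qualify: a file opened in "rb" mode or an in-memory io.BytesIO buffer does, a text-mode stream does not. The helper name is_binary_stream is hypothetical and not part of PERSEUS.

```python
import io


def is_binary_stream(stream) -> bool:
    """Return True if `stream` is a binary I/O object (hypothetical helper)."""
    # Raw and buffered binary streams derive from these two base classes;
    # text streams (open(..., "r"), io.StringIO) derive from io.TextIOBase.
    return isinstance(stream, (io.RawIOBase, io.BufferedIOBase))


# In-memory bytes, e.g. from an upload, wrapped as a binary stream
in_memory_pdf = io.BytesIO(b"%PDF-1.7 example bytes")
# A text stream, which would not satisfy the BinaryIO requirement
text_stream = io.StringIO("not binary")
```

Since the method is documented to accept BinaryIO, an in-memory buffer such as in_memory_pdf should be usable in place of a file opened from disk, although this has not been verified against PERSEUS itself.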
The following example shows how you can link a publication as a PDF to a project.
from perseus.datamanager import DatabaseManager, Project
project_object_id = "67af092f16affd02f502c903a3"
db_manager = DatabaseManager()
project = db_manager.get_item(Project, oid=project_object_id)
# Important: open with "rb" instead of "r" to read the file as BinaryIO.
# The with statement closes the file once it has been handed to the database.
with open("uploaded_publication.pdf", "rb") as publication:
    db_manager.add_associated_file(
        project,            # The database item to attach the file to
        publication,        # Accepts BinaryIO (buffered I/O)
        "publication.pdf",  # File identifier
        ["publication"],    # A list of tags
    )
Get a file
To retrieve a file, use the method DatabaseManager().get_file(). You need to pass the object id assigned to the file
you want to fetch.
To get the object id for a file, first retrieve the object the file is associated with. Then, you can access the
attribute files on this object. This attribute holds a dictionary that maps each filename to the corresponding
object id. Take a look at the following example, where we fetch the file proposal.pdf for a specific
project:
...
db_manager = DatabaseManager()
project = db_manager.get_item(Project, search_filter={"abbreviation": "my-project-abbreviation"})
if project is not None:
    file_object_id = project.files["proposal.pdf"]
    proposal_file = db_manager.get_file(file_object_id)
...
DatabaseManager().get_file() returns an instance of gridfs.grid_file.GridOut, which behaves like a binary file-like
object and can therefore be read like a file object.
More information can be found here.
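Because the files attribute is a dictionary, indexing it with a filename that is not present raises a KeyError. A minimal sketch of a defensive lookup follows, using a plain dict with a made-up id as a stand-in for project.files. The helper find_file_id is hypothetical and not part of PERSEUS.

```python
from typing import Any, Mapping, Optional


def find_file_id(files: Mapping[str, Any], filename: str) -> Optional[Any]:
    """Return the object id stored for `filename`, or None if it is absent."""
    return files.get(filename)


# Stand-in for project.files: filenames as keys, object ids as values (made-up id)
example_files = {"proposal.pdf": "650000000000000000000001"}

proposal_id = find_file_id(example_files, "proposal.pdf")
missing_id = find_file_id(example_files, "budget.xlsx")
```

Checking the result for None before calling get_file() avoids an unhandled exception when a file was never attached or was stored under a different name.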
The following example is from the internal API router responsible for the functionality to download a file by calling a specific API endpoint:
...
db_manager = DatabaseManager()
file = db_manager.get_file(ObjectId(file_id))
path = (
    f"{pathlib.Path(__file__).parent.resolve()}/../temp/{file_id}-{file.filename}"
)
with open(path, "wb") as f:  # important: use wb instead of w because of binary mode
    f.write(file.read())
...
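The router example reads the whole file into memory with a single read() call. For large files, a chunked copy may be preferable. The following is a minimal sketch, assuming only that the source object offers read(size) like an ordinary binary file object. The function save_stream is hypothetical, and io.BytesIO stands in for the object returned by get_file().

```python
import io
import shutil
import tempfile
from pathlib import Path


def save_stream(source, destination: Path, chunk_size: int = 64 * 1024) -> int:
    """Copy a binary file-like object to `destination` in chunks; return the file size."""
    with open(destination, "wb") as target:  # binary mode, as in the router example
        shutil.copyfileobj(source, target, chunk_size)
    return destination.stat().st_size


# io.BytesIO stands in for the file-like object returned by get_file()
fake_file = io.BytesIO(b"x" * 200_000)
with tempfile.TemporaryDirectory() as tmp_dir:
    written = save_stream(fake_file, Path(tmp_dir) / "download.bin")
```

shutil.copyfileobj reads and writes at most chunk_size bytes at a time, so memory use stays bounded regardless of the file size.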