Server for Generative AI Computations

    Turbocharging content recognition of archival documents

    The problem

    All archival fonds of Ukraine are scanned and stored as image, audio, and video files with, at best, a brief content description (annotation). However, the most valuable thing is the content itself.

    To enable fast, high-quality search and analysis of the fonds by content, these images, audio, and video need to be converted into a digital format that modern applications built on vector databases can understand, so that large language models (LLMs) can work freely with the information.

    Project Goal

    Create a powerful server for generative AI computations that will become a permanent tool for automatic content recognition from images, audio, and video, and for processing documents of the state archives of Ukraine using large language models (LLMs).

    Main server tasks

    Automatic recognition
    A continuous process of automatic content recognition from images, audio, and video in existing digital fonds and in fonds digitized in the future.
    Computing power
    Providing computing power for inference with specialized artificial intelligence models.

    Why is this important?

    Heritage preservation
    Transforming archives into a convenient digital format for integration with modern interfaces
    Accessibility
    Simplifying document search and analysis for researchers, journalists, and the public
    Innovation
    Using LLMs opens new possibilities for analysis, translation, and classification

    Current Progress

    Scanned: 150 million sheets
    Scanning rate: 30+ million sheets/year
    Remaining to scan: approximately 800 million sheets
    Prepared for LLM work: 0% of 150 million sheets
    Estimated scanning completion: 26+ years

    Main problem: document content remains inaccessible for search and analysis
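
    The scanning estimate above follows from dividing the remaining sheets by the yearly scanning rate. A minimal sketch of that arithmetic, assuming the rate stays constant at exactly 30 million sheets per year (the page says "30+", so the real figure may be somewhat lower); the function and constant names are illustrative only:

```python
# Figures quoted on this page, in sheets.
REMAINING_SHEETS = 800_000_000        # "approximately 800 million sheets"
SCANNING_RATE_PER_YEAR = 30_000_000   # "30+ million sheets/year", taken as exactly 30 million


def years_to_finish_scanning(remaining: int, rate_per_year: int) -> float:
    """Years needed to scan the remaining sheets at a constant yearly rate."""
    return remaining / rate_per_year


years = years_to_finish_scanning(REMAINING_SHEETS, SCANNING_RATE_PER_YEAR)
print(f"Estimated scanning completion: about {years:.1f} years")
```

    At exactly 30 million sheets per year the result falls between 26 and 27 years, in line with the "26+ years" figure.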

    Technical Details

    Testing showed that the simplest server with one NVIDIA L4 GPU card (approximately 2,000 euros each) can, under ideal conditions, recognize 149,300 images per year... One such card can help with the documents most critical for research, but processing the existing 150 million sheets would take approximately 1,000 years.

    To reach 30 million sheets per year, we need 30,000,000 ÷ 149,300 ≈ 201 GPU cards (rounded up). This means an investment of approximately 400,000 euros for GPUs alone, not including server hardware, electricity, and maintenance.
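
    The GPU estimate can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted above and makes two simplifying assumptions that are not stated in the test results: one recognized image corresponds to one sheet, and throughput scales linearly with the number of cards. All names are illustrative.

```python
import math

# Figures quoted in the text above.
IMAGES_PER_GPU_PER_YEAR = 149_300     # one NVIDIA L4 card, ideal conditions
GPU_PRICE_EUR = 2_000                 # approximate price per card
BACKLOG_SHEETS = 150_000_000          # sheets already scanned
TARGET_SHEETS_PER_YEAR = 30_000_000   # current yearly scanning rate


def years_for_backlog(backlog: int, gpus: int = 1) -> float:
    """Years to process the backlog (assumes 1 image = 1 sheet, linear scaling)."""
    return backlog / (IMAGES_PER_GPU_PER_YEAR * gpus)


def gpus_needed(target_per_year: int) -> int:
    """Smallest whole number of cards that reaches the yearly target."""
    return math.ceil(target_per_year / IMAGES_PER_GPU_PER_YEAR)


single_card_years = years_for_backlog(BACKLOG_SHEETS)
gpus = gpus_needed(TARGET_SHEETS_PER_YEAR)
gpu_cost_eur = gpus * GPU_PRICE_EUR   # GPUs only: no servers, power, or maintenance

print(f"One card, existing backlog: about {single_card_years:,.0f} years")
print(f"Cards needed for the yearly target: {gpus}")
print(f"GPU cost: about {gpu_cost_eur:,} euros")
```

    Changing the constants shows how the estimate shifts if a faster card, a different price, or a higher scanning rate is assumed.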

    Timeline

    1. Information gathering and planning: completed
    2. Partnership agreements: in progress
    3. Software development and testing: 12 months
    4. Server organization: 6-12 months
    5. Support and improvement: 24 months

    Expected results

    Modern infrastructure
    for digital transformation of archives
    Processing acceleration
    search and analysis of archival materials
    Heritage openness
    accessibility of Ukrainian historical heritage

    Action is needed now

    Contact us if you:
    1. can fully or partially finance the project
    2. can provide equipment
    3. can optimize and/or reduce costs
    4. can spread the word about this in your media
    Contacts: hello@ukraineincolor.com
    All rights reserved © Ukraine in color 2025