
Feature Implementation Plan: Hierarchical Processing Methods

📋 Todo Checklist

  • [ ] Update the ProcessingMethod entity to support a parent-child relationship.
  • [ ] Create a database migration to add the parent relationship and seed the top-level processing methods.
  • [ ] Update the LLM schema (coffee-bean.schema.json) to extract both the specific processing method and its parent.
  • [ ] Update the CoffeeBeanPersister to handle the new hierarchical data.
  • [ ] Update the ProcessingMethodCrudController to manage the parent relationship.
  • [ ] Update the /api/processing-methods endpoint to return the hierarchical structure and remove pagination.
  • [ ] Write unit and integration tests for all new and updated functionality.
  • [ ] Final Review and Testing

🔍 Analysis & Investigation

Codebase Structure

  • Entity: src/Entity/ProcessingMethod.php will be modified to add a self-referencing parent relationship.
  • Migration: A new Doctrine migration will be created to apply the schema change and, importantly, to insert the top-level processing methods as defined in the GitHub issue.
  • LLM Schema: config/schemas/coffee-bean.schema.json will be updated to instruct the LLM to identify the parent category for any given processing method.
  • Persistence: src/Service/Crawler/Persistance/CoffeeBeanPersister.php will be updated to correctly associate the processing method with its parent.
  • Admin: src/Controller/Admin/ProcessingMethodCrudController.php will be modified to allow administrators to set the parent for a processing method.
  • API: src/Controller/Api/ProcessingMethodController.php and src/Repository/ProcessingMethodRepository.php will be refactored to return the data in a hierarchical format.

Current Architecture

The current system treats processing methods as a flat list. This plan will introduce a hierarchical data model. The most significant change to the architecture will be in the API response for /api/processing-methods. Instead of a paginated flat list, it will return a nested structure. This is a necessary change to properly represent the hierarchical data to the frontend.

Dependencies & Integration Points

  • Doctrine: Will be used for the self-referencing relationship and the migration.
  • LLM: The prompt engineering in the JSON schema is critical to get accurate parent classifications.
  • EasyAdminBundle: The admin interface will use an AssociationField to create a dropdown for selecting the parent method.

Considerations & Challenges

  • Pagination: As you noted, pagination of a hierarchical list is complex and not user-friendly. The plan is to remove pagination from the /api/processing-methods endpoint and return the full nested structure. The number of categories is small enough that this will not be a performance issue.
  • LLM Accuracy: The LLM's ability to correctly identify the parent category will be crucial. The prompt in the schema must be very clear.
  • Data Integrity: The migration must correctly create the parent categories. The CoffeeBeanPersister must correctly link existing and new methods to these parents.

📝 Implementation Plan

Prerequisites

  • No new external dependencies are required.

Step-by-Step Implementation

  1. Update ProcessingMethod Entity

    • Files to modify: src/Entity/ProcessingMethod.php
    • Changes needed:
      • Add a parent property with a ManyToOne self-referencing relationship.
      • Add a children property with a OneToMany self-referencing relationship.
      • Ensure the parent is nullable for top-level methods.
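A minimal sketch of the mapping this step describes, assuming Doctrine ORM attribute mapping and an App\Entity namespace. Existing fields are omitted, and the onDelete: 'SET NULL' behaviour and the accessor names are assumptions, not decisions taken in this plan:

```php
<?php
declare(strict_types=1);

namespace App\Entity;

use Doctrine\Common\Collections\ArrayCollection;
use Doctrine\Common\Collections\Collection;
use Doctrine\ORM\Mapping as ORM;

#[ORM\Entity]
class ProcessingMethod
{
    // Existing fields (id, name, description, ...) are omitted here;
    // only the new self-referencing members are shown.

    // Owning side: null for top-level methods.
    // Assumption: deleting a parent detaches its children instead of deleting them.
    #[ORM\ManyToOne(targetEntity: self::class, inversedBy: 'children')]
    #[ORM\JoinColumn(name: 'parent_id', referencedColumnName: 'id', nullable: true, onDelete: 'SET NULL')]
    private ?self $parent = null;

    /** @var Collection<int, self> */
    #[ORM\OneToMany(mappedBy: 'parent', targetEntity: self::class)]
    private Collection $children;

    public function __construct()
    {
        // If the entity already has a constructor, only this line needs adding.
        $this->children = new ArrayCollection();
    }

    public function getParent(): ?self
    {
        return $this->parent;
    }

    public function setParent(?self $parent): static
    {
        $this->parent = $parent;

        return $this;
    }

    /** @return Collection<int, self> */
    public function getChildren(): Collection
    {
        return $this->children;
    }
}
```

Keeping parent as the owning side means only setParent() has to be called when linking; the children collection is populated by Doctrine on load.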
  2. Create Migration and Seed Data

    • Files to create: A new Doctrine migration file.
    • Changes needed:
      • Generate a migration to add the parent_id column to the processing_method table.
      • In the up() method of the migration, after adding the column, insert the top-level processing methods from the GitHub issue (WASHED/WET, NATURAL/DRY, etc.).
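The migration could look roughly like the sketch below. The class name, constraint name, integer id type, and PostgreSQL-flavoured SQL are all assumptions; the real file should start from the generated migration, and the seed INSERT has to match the table's actual columns and id strategy:

```php
<?php
declare(strict_types=1);

namespace DoctrineMigrations;

use Doctrine\DBAL\Schema\Schema;
use Doctrine\Migrations\AbstractMigration;

// Hypothetical class name; generated migrations use a timestamped "Version..." name.
final class VersionAddProcessingMethodParent extends AbstractMigration
{
    public function getDescription(): string
    {
        return 'Add parent_id to processing_method and seed top-level methods';
    }

    public function up(Schema $schema): void
    {
        // Assumption: integer primary key. Use the matching type if ids are UUIDs.
        $this->addSql('ALTER TABLE processing_method ADD parent_id INT DEFAULT NULL');
        $this->addSql(
            'ALTER TABLE processing_method ADD CONSTRAINT fk_processing_method_parent '
            . 'FOREIGN KEY (parent_id) REFERENCES processing_method (id) ON DELETE SET NULL'
        );

        // Only the two names spelled out in this plan are listed;
        // the remaining top-level methods come from the GitHub issue.
        // Assumption: the id is filled by a column default and name is the
        // only required column. Adapt if ids come from a sequence.
        foreach (['WASHED/WET', 'NATURAL/DRY'] as $name) {
            $this->addSql('INSERT INTO processing_method (name) VALUES (:name)', ['name' => $name]);
        }
    }

    public function down(Schema $schema): void
    {
        // Seeded rows are left in place; only the schema change is reverted.
        $this->addSql('ALTER TABLE processing_method DROP CONSTRAINT fk_processing_method_parent');
        $this->addSql('ALTER TABLE processing_method DROP COLUMN parent_id');
    }
}
```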
  3. Update LLM Schema and Persistence

    • Files to modify: config/schemas/coffee-bean.schema.json, src/Service/Crawler/Persistance/CoffeeBeanPersister.php
    • coffee-bean.schema.json Changes:
      • In the processingMethods item schema, add a new property parentProcessingMethod of type string.
      • Update the description to instruct the LLM to determine the parent category from the provided list (WASHED/WET, etc.) for each processing method it finds.
    • CoffeeBeanPersister.php Changes:
      • In the logic that handles processing methods, first find or create the parent method.
      • Then, when finding or creating the specific processing method, set its parent to the one you just retrieved.
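The find-or-create order described for CoffeeBeanPersister can be illustrated without Doctrine. In this sketch, Method, MethodRegistry, and linkProcessingMethod are hypothetical stand-ins for the entity, the repository lookup, and the persister logic. The name key is assumed; only parentProcessingMethod comes from this plan:

```php
<?php
declare(strict_types=1);

// Stand-in for the ProcessingMethod entity (hypothetical, in-memory only).
final class Method
{
    public ?Method $parent = null;

    public function __construct(public readonly string $name)
    {
    }
}

// Stand-in for a repository lookup followed by persist().
final class MethodRegistry
{
    /** @var array<string, Method> keyed by the lower-cased, trimmed name */
    private array $byName = [];

    public function findOrCreate(string $name): Method
    {
        $key = strtolower(trim($name));

        // Reuse the existing method if one was already registered under this name.
        return $this->byName[$key] ??= new Method(trim($name));
    }
}

/**
 * Resolves the parent first, then the specific method, then links them.
 *
 * @param array{name: string, parentProcessingMethod?: string|null} $item one extracted item
 */
function linkProcessingMethod(MethodRegistry $registry, array $item): Method
{
    $parentName = $item['parentProcessingMethod'] ?? null;
    $parent = ($parentName !== null && $parentName !== '')
        ? $registry->findOrCreate($parentName)
        : null;

    $method = $registry->findOrCreate($item['name']);

    // Never make a method its own parent (e.g. the LLM returns the same name twice).
    if ($parent !== null && $parent !== $method) {
        $method->parent = $parent;
    }

    return $method;
}
```

Matching names case-insensitively is a design choice of the sketch. It keeps slightly different LLM spellings of a parent category from creating duplicate top-level methods.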
  4. Update Admin UI

    • Files to modify: src/Controller/Admin/ProcessingMethodCrudController.php
    • Changes needed:
      • In the configureFields method, add an AssociationField for the parent property.
      • This will render as a dropdown in the admin form, allowing you to assign a parent to any processing method.
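A sketch of the field configuration, assuming the controller extends EasyAdmin's AbstractCrudController and that the entity exposes a name field. The label text is illustrative:

```php
<?php
declare(strict_types=1);

namespace App\Controller\Admin;

use App\Entity\ProcessingMethod;
use EasyCorp\Bundle\EasyAdminBundle\Controller\AbstractCrudController;
use EasyCorp\Bundle\EasyAdminBundle\Field\AssociationField;
use EasyCorp\Bundle\EasyAdminBundle\Field\TextField;

class ProcessingMethodCrudController extends AbstractCrudController
{
    public static function getEntityFqcn(): string
    {
        return ProcessingMethod::class;
    }

    public function configureFields(string $pageName): iterable
    {
        // Assumed existing field; keep whatever the controller already yields.
        yield TextField::new('name');

        // Optional, because top-level methods have no parent.
        yield AssociationField::new('parent', 'Parent method')->setRequired(false);
    }
}
```

For the dropdown to show readable labels, the ProcessingMethod entity needs a __toString() method (for example, returning its name).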
  5. Update API Endpoint

    • Files to modify: src/Controller/Api/ProcessingMethodController.php, src/Repository/ProcessingMethodRepository.php
    • ProcessingMethodController.php Changes:
      • Remove the page and limit parameters from the getProcessingMethods method.
      • Call a new repository method findAllHierarchical().
      • The response will be a nested array, so no custom formatting is needed if the serialization groups are set up correctly.
      • Update the #[OA\Get] annotation to reflect the new, non-paginated, hierarchical response structure.
    • ProcessingMethodRepository.php Changes:
      • Create a new method findAllHierarchical().
      • This method will first fetch all processing methods.
      • Then, it will manually build the hierarchical array in PHP by grouping children under their parents. It should only return the top-level methods (where parent is null), with their children nested inside.
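The grouping step in findAllHierarchical() can be sketched in plain PHP. The function name buildHierarchy and the row shape (id, name, parentId) are assumptions for illustration; the real method would work on ProcessingMethod entities or on whatever array shape the repository query returns:

```php
<?php
declare(strict_types=1);

/**
 * Groups a flat list of processing methods into top-level methods
 * with their children nested inside (two levels only).
 *
 * @param list<array{id: int, name: string, parentId: int|null}> $rows
 * @return list<array{id: int, name: string, children: list<array{id: int, name: string}>}>
 */
function buildHierarchy(array $rows): array
{
    $parents = [];

    // First pass: collect the top-level methods (parent is null), keyed by id.
    foreach ($rows as $row) {
        if ($row['parentId'] === null) {
            $parents[$row['id']] = [
                'id' => $row['id'],
                'name' => $row['name'],
                'children' => [],
            ];
        }
    }

    // Second pass: attach each child to its parent.
    // Rows pointing at an unknown parent are skipped.
    foreach ($rows as $row) {
        $parentId = $row['parentId'];
        if ($parentId !== null && isset($parents[$parentId])) {
            $parents[$parentId]['children'][] = [
                'id' => $row['id'],
                'name' => $row['name'],
            ];
        }
    }

    // Drop the id keys so the result serializes as a JSON array, not an object.
    return array_values($parents);
}
```

Two passes keep the result independent of row order, so a child that appears before its parent in the query result is still attached. The sketch covers only the two levels this plan describes; deeper nesting would need a recursive variant.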

Testing Strategy

  • Unit Tests:
    • Write a unit test for the ProcessingMethodRepository::findAllHierarchical method to ensure it returns the correct nested structure.
    • Update tests for the CoffeeBeanPersister to verify that it correctly links processing methods to their parents.
  • Integration Tests:
    • Update the integration test for /api/processing-methods to assert that it returns the new hierarchical structure and no longer supports pagination.
  • Manual Testing:
    • In the admin panel, verify that you can create a new processing method and assign it a parent.
    • Trigger a crawl and check that the LLM correctly identifies a processing method and its parent, and that they are saved correctly in the database.

🎯 Success Criteria

  • The ProcessingMethod entity supports a hierarchical structure.
  • The database is seeded with the correct top-level processing methods.
  • The LLM extraction process correctly identifies and links processing methods to their parents.
  • The admin UI allows for managing the parent-child relationships.
  • The /api/processing-methods endpoint returns a nested, hierarchical list of all processing methods.
  • The new functionality is fully covered by tests.

GitHub Issue

Currently, we have many different types of fermentation, often probably overlapping even though we haven't even crawled that many beans yet. We should come up with a way to group them together comprehensively, without giving the users every single option to filter by.

Current values (August 2nd):

name                      │ description
──────────────────────────┼──────────────────────────────────────────────────────────────────────────────────────
Combined Fermentation     │ Processing using combined fermentation method.
Depulped                  │ Cherries were depulped before fermentation.
Dry Processing            │ Coffee cherries are carefully handpicked based on ripeness and dried on raised beds…
Kenya Washed              │ In the Kenya process, coffee cherries are sorted, depulped, fermented in ceramic ta…
Lactic Washed             │ A nuanced limited oxygen pre-fermentation in a sealed tank where bacteria and yeast…
Layered Fermentation      │ Coffee is fermented in layers added over multiple days without water to control pH …
Natural                   │ Coffee cherries are dried and fermented inside the whole cherry, which imparts frui…
Natural Decaffeination    │ Natural caffeine extraction using a solvent derived from coffee pulp over three cyc…
Parabolic Bed Dried       │ The coffee is dried on parabolic drying beds, which allow for controlled drying con…
Patio Dried               │ Coffee cherries are dried on patios with careful temperature control and frequent t…
Raised Bed Dried          │ Coffee is dried on raised African-style beds allowing controlled drying and airflow…
Sugar-Cane Decaf          │ Decaffeination is done using a product from sugar-cane fermentation, without chemic…
Aerobic Fermentation      │
Anaerobic Natural         │ This coffee undergoes 36 hours of limited oxygen fermentation inside the fruit and …
Double Fermentation       │ Cherries undergo a first anaerobic fermentation for 48 hours at 18°C, followed by p…
Anaerobic Washed          │ Coffee cherries undergo pre-fermentation in cherry for 48 hours, are depulped and k…
Bio-Innovation Natural    │ Starts with a 100-hour limited oxygen fermentation followed by depulping and washin…
Biochar Raised-Bed Dried  │ The coffee is dried on raised beds enhanced with biochar to aid fermentation and dr…
Bioinnovation Washed      │ Process starts with 100-hour limited oxygen fermentation in cherry form, followed b…
Cold Brew                 │ Cold brew coffee refers to coffee brewed with cold water over an extended period ra…
Red Honey                 │ The coffee cherries are depulped but not fermented in tanks; the mucilage is left o…
Swiss Water Decaffeinated │ A decaffeination method where green coffee extract is used to remove caffeine witho…
Thermal-Shock Washed      │ Double fermentation thermal shock process involving extended fermentations in seale…
Tyoxidator                │ Ripe cherries undergo an oxidative pre-fermentation before being pulped and ferment…
Washed                    │ Coffee cherries are depulped and fermented to remove the mucilage, then washed with…
Wet Washed                │ The coffee cherries are depulped, fermented to remove mucilage, washed and then dri…
White-Honey               │ This coffee is processed using the White-Honey method, including patio drying with …
Yellow Diamond Process    │ A honey-process coffee method involving depulping and placing coffee in large piles…
Yellow Honey              │ Coffee is depulped, then laid out on drying patios with mucilage left on the seed, …
Dry Washed                │ Depulped and fermented dry for 24 hours, followed by channel washing, then dried ap…
EA Decaf                  │ Sugar cane ethyl acetate decaffeination extracts caffeine at low pressure and tempe…
Honey                     │ Coffee cherries are depulped but not fermented in tanks; the seeds dry with mucilag…
Honey Processed           │ Coffee cherries are depulped but mucilage is left on the seeds during drying, allow…
Unknown                   │
Anaerobic Fermentation    │ The coffee cherries are dry-fermented and anaerobically fermented for 90 hours, the…
Bioinnovation Natural     │ Limited oxygen pre-fermentation in a sealed tank with increased sodium inoculation …
Carbonic Maceration       │ Inmaculada Eugenioides undergoes an 8 day carbonic maceration fermentation limiting…

waytocoffee (Member) • 7d • Edited

In bold you have the first category and underneath the subcategories. In the first instance, users should only be able to choose between washed/wet process, natural/dry process, honey/pulped natural process, and experimental/innovative process.

WASHED/WET PROCESS

Washed, Wet Washed, Kenya Washed, Dry Washed, Double Fermentation, Thermal-Shock Washed, Lactic Washed, Anaerobic Washed, Bioinnovation Washed

NATURAL/DRY PROCESS

Natural, Dry Processing, Anaerobic Natural, Bio-Innovation Natural, Bioinnovation Natural

HONEY/PULPED NATURAL

Honey, Honey Processed, Red Honey, Yellow Honey, White-Honey, Yellow Diamond Process

EXPERIMENTAL/INNOVATIVE

All anaerobic variations, Carbonic maceration, Thermal shock methods, Bio-innovation methods
Other experimental techniques

Anaerobic Fermentation Anaerobic Slow Dry Aerobic Fermentation Carbonic Maceration Tyoxidator Layered Fermentation Combined Fermentation Inoculated Multi-Cultured Yeasts Double Anaerobic Inoculated Yeast

DECAFFEINATION METHODS

Natural Decaffeination, Swiss Water Decaffeinated, EA Decaf (Ethyl Acetate), Sugar-Cane Decaf

Data Cleanup needed

Depulped

DRYING METHODS (Either as a secondary filter or leave it out, as it's not technically a processing method but just describes the drying aspect)

Raised Bed Dried, Patio Dried, Parabolic Bed Dried, Biochar Raised-Bed Dried

View this issue on GitHub: https://github.com/thewaytocoffee/bean.business/issues/2