Grand refactoring of the AI features #15688
Merged
284 commits
52ce8f1
Refactor chat history
InAnYan 0803345
refactor chat history + start working on UI
InAnYan 79a6df3
initroduce 1 class. idk why
InAnYan 5c7d11a
Move out database listeners
InAnYan bc5ebfe
middle work
InAnYan cb2b08d
middle work
InAnYan 55bb1aa
middle work
InAnYan ea09dc6
middle work
InAnYan a00b9ac
Start working on tasks
InAnYan 02ff6ff
middle work
InAnYan 3dc512b
Somewhat AiSummary
InAnYan 2398f4e
middle work
InAnYan edb3d4e
summary almost done
InAnYan 90a2eaa
middle work
InAnYan 67ef0a0
Finish AI summary
InAnYan 6f8c361
Refactor locations
InAnYan e40cb01
Start working on AI chat UI
InAnYan ad9757e
Use single chat history repository
InAnYan 9fefac2
Remove old classes
InAnYan 57a81de
Remove old classes x2
InAnYan 0e2e03f
Middle work
InAnYan b3cbda9
middle work
InAnYan 3e59ef9
finish chat?
InAnYan e638de3
Something works
InAnYan b346a29
bug fixing
InAnYan 37562df
fix Ai chat
InAnYan 6bb421b
Implement status window
InAnYan 1b79445
Fixings
InAnYan d9eb817
rename identifiers
InAnYan 6382b62
Make group window
InAnYan af1f4f3
Fix window close
InAnYan e7c66a0
Some arch changes
InAnYan c8a3e7d
Add ADRs
InAnYan e7ca964
Some changes
InAnYan 0969f8f
mid work
InAnYan 7e87908
Almost finished
InAnYan cd7a7ba
Add draft reqs
InAnYan a001545
fix ai chat
InAnYan de98d83
add ai OFT
InAnYan c49a904
add adr for messages
InAnYan 42f832c
Empty messages package
InAnYan 2ef6fa8
add ADR on messages
InAnYan f8ecd53
add new types
InAnYan 5c50a6d
Migrate to new type (todo: migrate messages v1->v2, show debug, expor…
InAnYan e1b6eea
feat: mid work on removing
InAnYan 248c324
refactor(ai): refactor AiDatabaseListener and AiFeature
InAnYan b5b404e
refactor(ai): remove AiTemplateKind
InAnYan 1bbbe72
refactor(ai): fix AiDatabaseListeners
InAnYan be3eb71
refactor(ai): start AiSummarizationLogic
InAnYan c0d5db0
refactor(ai): petit change
InAnYan 6e8e11a
refactor(ai): remove some user message templates
InAnYan 85b5bb9
refactor(ai): add ai library id
InAnYan 58bc248
refactor(ai): refactor ChatIdentifier with ai library id
InAnYan d402a79
refactor(ai): refactor AiSummaryIdentifier
InAnYan 810d8b0
refactor(ai): rename AI identifiers
InAnYan 639748e
refactor(ai): rename AI identifiers
InAnYan bc40a43
refactor(ai): remove AiTemplateKind.java
InAnYan 619f22c
refactor(ai): change tokenizators parameters
InAnYan eb62e20
refactor(ai): remove customimplementations package
InAnYan 2784ddb
refactor(ai): make TokenizationAiFeature
InAnYan 051a9fb
refactor(ai): change order of the responsibility chain
InAnYan fb5cc8d
refactor(ai): use string templates
InAnYan 8919106
chore(gui): remove unused file
InAnYan 1ccfc3e
fix(ai): add system message to the chat
InAnYan d856055
refactor(ai): use hash for ingested documents tracking
InAnYan c38d529
refactor(ai): add GenerateEmbeddingsAiDatabaseListener and move files
InAnYan 073f7d8
refactor(ai): add transfer of summaries
InAnYan c4550d5
refactor(ai): petit change
InAnYan e690a65
refactor(ai): simplify templates
InAnYan abfd737
refactor(ai): fix name
InAnYan f4b808d
refactor(ai): add factories
InAnYan 44e6abf
refactor(ai): refactor summarization + add RAM layer
InAnYan 1e6f8e2
refactor(ai): make migrations
InAnYan cf4ffb8
refactor(ai): remove features
InAnYan 5d5e6f2
refactor(ai): remove AiChatLogic and introduce GenerateRagResponseTask
InAnYan 4610fcb
refactor(ai): clean-up code
InAnYan d44de0b
Initial plan
Copilot 7b27d51
Initial plan
Copilot 2af3303
feat: implement AI export feature for chat and summary
Copilot d6a5a0a
fix: use correct BibEntryWriter API for BibTeX serialization
Copilot 0011b49
refactor: extract helper methods to reduce duplication in AiChatView …
Copilot c0fa43e
Implement follow-up questions feature for AI chat
Copilot 7be33e5
Fix magic number and duplicate regex pattern issues from code review
Copilot 3d058dd
Fix constant placement and remove duplicate constant for code review …
Copilot 7c2844f
Changes before error encountered
Copilot 7830a21
Address review comments: BackgroundTask for follow-up questions, View…
Copilot 6ca5f63
Update AiTab.java
InAnYan 0ff6310
Update AiTab.java
InAnYan a72ec17
refactor(ai): update embeddings to use file hash
InAnYan 87c6622
Update GenerateFollowUpQuestions.java
InAnYan 42ea039
refactor: move export logic to view models, add interfaces, add markd…
Copilot 819c2bf
refactor: improve method names per code review feedback
Copilot f92b21e
Merge branch 'refactor/ai-1' into copilot/add-follow-up-questions-fea…
InAnYan 9749dde
Merge pull request #209 from InAnYan/copilot/add-follow-up-questions-…
InAnYan 2eff7a7
refactor: add AiMetadata record, remove ExportMessage, use ChatMessag…
Copilot 2b9ef40
style: convert /// comments to standard Javadoc in AiMetadata and AiT…
Copilot b2b8df7
refactor: embed AiMetadata in AiSummary; remove AiMetadata.empty()
Copilot d8ac13c
refactor: move export responsibility to AiSummaryShowingViewModel
Copilot ac6607d
style: remove trailing blank line in AiSummaryShowingView
Copilot 0d2d661
refactor(ai): add chat in memory cache
InAnYan 7087ef3
refactor(ai): make compile
InAnYan 76991c4
refactor(ai): clean ups
InAnYan e2c3084
refactor(ai): fix migrations
InAnYan cc3316d
refactor(ai): quick fix for migrations
InAnYan d9a12e6
refactor(ai): fix
InAnYan d2ba0da
refactor(ai): fix migration
InAnYan 1e2de88
Merge branch 'refactor/ai-1' of https://github.com/InAnYan/jabref int…
Copilot db1415d
Merge pull request #208 from InAnYan/copilot/add-export-feature-ai
InAnYan 2d1d3aa
refactor(ai): fix a bit the follow up questions
InAnYan e638abd
refactor(ai): fix chat history scroll
InAnYan 1c14365
feat: redesign AI chat - move model to status window, add export ther…
Copilot 03d1071
refactor: extract formatChatModelLabel into named method in AiChatSta…
Copilot 87ba960
refactor(ai): clean quickly
InAnYan f674697
refactor: move chat model building and export to AiChatStatusViewMode…
Copilot 53e6324
Merge remote-tracking branch 'origin/refactor/ai-1' into copilot/rede…
Copilot fadb61c
refactor(ai): add embedding model download cache
InAnYan ba75fe2
Merge remote-tracking branch 'origin/refactor/ai-1' into copilot/rede…
Copilot 889fc3d
Merge pull request #211 from InAnYan/copilot/redesign-ai-chat-status-…
InAnYan 675350a
refactor(ai): clear ingested documents on embedding model change
InAnYan 4664cb1
refactor(ai): ui change
InAnYan 604b656
refactor(ai): ui change for Ai Tab
InAnYan ae3241f
refactor(ai): restore regenerate message func
InAnYan 6417ebb
refactor(ai): fix adr order
InAnYan 880c3c1
refactor(ai): fix adr order
InAnYan 1646292
refactor(ai): fix chatting requirements
InAnYan 1ba5794
refactor(ai): change future feature
InAnYan 3abb140
refactor(ai): clean and fix file hasher tests
InAnYan 31fe467
refactor(ai): remove persited file ingestor test
InAnYan 422d92e
refactor(ai): refactor plain citation parsing with llm
InAnYan 33f125f
refactor(ai): less metadata changes
InAnYan fbce38d
refactor(ai): improve comment
InAnYan e6533e5
refactor(ai): refactor EmbeddingSimilarityMetric
InAnYan 2db56d7
refactor(ai): quick modules change
InAnYan e0c7d7e
refactor(ai): remove ResolvedGroup
InAnYan dc241a1
refactor(ai): update AI docs
InAnYan d1b3862
refactor(ai): Rename PredefinedEmbeddingModel
InAnYan 4ad31ba
refactor(ai): change ChatMessage.Role
InAnYan b5a9498
refactor(ai): change ChatHistoryRecord
InAnYan a1f5b40
refactor(ai): cleanups
InAnYan e86f51a
refactor(ai): remove ListenersHelper.java
InAnYan 378958c
refactor(ai): remove CitationKeyCheck.java
InAnYan eedff09
refactor(ai): fix AiPreferences
InAnYan a7ec558
refactor(ai): revert MVStoreBase
InAnYan 4e3efc2
refactor(ai): remove BibEntryListComparatorById
InAnYan 37d6fd4
refactor(ai): do not save if deleted
InAnYan a81e2b2
refactor(ai): simplify chunked summarization logic
InAnYan e022213
refactor(ai): clean
InAnYan 937632a
refactor(ai): clean
InAnYan dc3e977
refactor(ai): refactor
InAnYan 41336bd
refactor(ai): move exporters
InAnYan 59e0037
refactor(ai): static class refactors
InAnYan c898b6c
refactor(ai): remove comment
InAnYan df3a541
refactor(ai): refactor answer engines
InAnYan 020b180
refactor(ai): remove comment
InAnYan b9550f6
refactor(ai): cleanup + ingestion
InAnYan 8aec688
refactor(ai): cleanup
InAnYan d89bc33
refactor(ai): refactor ingestion + clean ups + follow-up questions
InAnYan f189fc7
refactor(ai): refactor bindings
InAnYan 884a822
refactor(ai): simplify chat message
InAnYan ce0022d
refactor(ai): change UI ai chat
InAnYan 69c67c2
refactor(ai): remove PropertiesHelper.java
InAnYan 4ee03ea
refactor(ai): block chat history during loading
InAnYan 96893e1
refactor(ai): refactor to use one status pane
InAnYan 396376e
refactor(ai): move files + add tooltip
InAnYan f90a002
refactor(ai): refactor bindings + cleanup
InAnYan f287dcf
refactor(ai): refactor ai default preferences
InAnYan cf0eef1
refactor(ai): add API key changes in Preferences
InAnYan 5ed90c4
refactor(ai): convert javadoc to markdown
InAnYan abb2c3b
Revert "refactor(ai): convert javadoc to markdown"
InAnYan b425505
Merge branch 'main' into refactor/ai-1
InAnYan d4c21b9
refactor(ai): make it run
InAnYan 2c4547a
refactor(ai): fix + fix checkstyle
InAnYan bd4a2ca
refactor(ai): add tests
InAnYan c7a9c9f
refactor(ai): convert comments
InAnYan 8d33c1e
refactor(ai): docs
InAnYan cd5d26f
refactor(ai): update ADR 0057
InAnYan e3342a7
refactor(ai): revert checkstyle
InAnYan e8e0a28
refactor(ai): add MVStore comment
InAnYan 7f1aea2
refactor(ai): update docs
InAnYan 1ace5c9
refactor(ai): add testing resources
InAnYan 4c0c8dc
refactor(ai): update from my review
InAnYan 36fcedf
refactor(ai): add context menu for chat messages
InAnYan ad45161
refactor(ai): remove assertions
InAnYan 1203ec4
refactor(ai): just fixes
InAnYan d1b687b
refactor(ai): fix tracing
InAnYan 8ef5d11
refactor(ai): add CHANGELOG entry
InAnYan acfbfda
refactor(ai): add answer engine combobox
InAnYan 9c7a535
Merge branch 'main' into refactor/ai-1
koppor 7af78d4
docs(adr): use CUID2 for aiLibraryId
koppor 60eaa47
refactor(ai): rename vBox/buttonsVBox in AiChatMessageView
koppor ebbfe9d
undo
koppor 28dd5b0
refactor(ai): tidy AiChatMessageView
koppor a79abc0
refactor(ai): localize FileStatus and use Directories.getUserDirectory
koppor c0f9533
refactor(ai): collapse 2-value State enums to BooleanProperty
koppor d3b6947
Fix HTML block
koppor 188d7e1
Merge branch 'main' into refactor/ai-1
InAnYan 14fbd67
Merge branch 'main' into refactor/ai-1
InAnYan 9693f40
reactor(ai): fix markdown
InAnYan fefce14
reactor(ai): fix new lines at the end
InAnYan a0d172e
reactor(ai): fix from review
InAnYan 4f66e5e
reactor(ai): fix do while loop
InAnYan 3b4d23b
reactor(ai): fix new lines
InAnYan 8119660
Fix submodules
InAnYan 5bc1ab2
reactor(ai): fix code style
InAnYan 597f0fa
reactor(ai): fix annotations
InAnYan 2a3a700
Merge branch 'main' into refactor/ai-1
InAnYan 0ea7dbd
reactor(ai): fix links in markdown
InAnYan 1666477
reactor(ai): fix CHANGELOG
InAnYan c8393b4
reactor(ai): update module-info.java
InAnYan 48dafe7
reactor(ai): apply openrewrite
InAnYan 7e0048f
reactor(ai): migrate to jackson 3
InAnYan 5b6ba98
Merge branch 'main' into refactor/ai-1
InAnYan 7a3eccc
reactor(ai): fix links
InAnYan 9a328f9
reactor(ai): fix code style
InAnYan 6d971c5
reactor(ai): fix arch tests
InAnYan eb27921
reactor(ai): fix arch tests x2
InAnYan 039d829
reactor(ai): add check if a file is already ingested
InAnYan 8ad81bc
reactor(ai): fix chat migration
InAnYan 9efce94
reactor(ai): fix code style
InAnYan fcf1a08
fix(docs/requirements): add expert settings AI requirements
InAnYan accad0b
Merge branch 'main' into refactor/ai-1
InAnYan a8b1931
reactor(ai): fix module info
InAnYan 40ed7d6
Merge remote-tracking branch 'origin/refactor/ai-1' into refactor/ai-1
InAnYan d95af78
fix: fix module info
InAnYan a3c88d0
fix: fix module info
InAnYan f476826
fix: build gradle
InAnYan 23c43f9
Merge branch 'main' into refactor/ai-1
InAnYan 5dfb52e
reactor(ai): fix localization
InAnYan 2048323
Merge branch 'main' into refactor/ai-1
InAnYan 1a240b9
reactor(ai): fix tests
InAnYan cccde54
Merge branch 'main' into refactor/ai-1
InAnYan 51e2dde
Merge branch 'main' into refactor/ai-1
InAnYan 0c15308
Merge branch 'main' into refactor/ai-1
InAnYan 5965d9f
Update CHANGELOG.md
InAnYan 6397848
Merge branch 'main' into refactor/ai-1
koppor 779f265
reactor(ai): fix order of ADR
InAnYan 1c5cb1f
reactor(ai): add new good to the adr
InAnYan d41c599
reactor(ai): refine requirements
InAnYan 4c968cf
reactor(ai): refine null check
InAnYan 1d4736b
reactor(ai): add qodo comments
InAnYan 1a063fb
reactor(ai): add qodo comments
InAnYan ad6fe00
reactor(ai): fix formatting
InAnYan 00201b4
reactor(ai): fix from qodo
InAnYan fbc5ea8
Merge branch 'main' into refactor/ai-1
InAnYan 6c41ec6
Merge branch 'main' into refactor/ai-1
InAnYan 36761f4
reactor(ai): move AI adrs
InAnYan 75c4cca
Merge branch 'main' into refactor/ai-1
koppor 474e2f4
Merge branch 'main' into refactor/ai-1
koppor 1cb86ad
Merge branch 'main' into refactor/ai-1
koppor 5232fa9
Merge branch 'main' into refactor/ai-1
koppor
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| --- | ||
| nav_order: 0058 | ||
| parent: Decision Records | ||
| --- | ||
| # Use Deep Java Library for embeddings in AI features | ||
|
|
||
| <!-- dsn->feat~ai.answer-engines.embeddings-search~1 --> | ||
|
|
||
| ## Context and Problem Statement | ||
|
|
||
| JabRef needs to use embedding models to perform Retrieval-Augmented Generation (RAG) by generating embeddings for chunks of papers. | ||
|
|
||
| The Java AI ecosystem is not as diverse as the Python AI ecosystem, so the choice must be made carefully to ensure stability and ease of use for end users. | ||
|
|
||
| Which library to choose? | ||
|
|
||
| ## Decision Drivers | ||
|
|
||
| * The library should not require additional setup from the user side | ||
| * It should be cross-platform | ||
| * It should support a wide variety of model architectures | ||
| * It should have an easy-to-use API | ||
| * The requests that the library makes should be known and controlled | ||
| * We should know how and where the library downloads and stores models | ||
|
|
||
| ## Considered Options | ||
|
|
||
| * LangChain4j | ||
| * ONNX Runtime | ||
| * Deep Java Library (DJL) | ||
| * DeepLearning4j | ||
|
|
||
| ## Decision Outcome | ||
|
|
||
| Chosen option: "Deep Java Library (DJL)", because it satisfies all our requirements for an all-in-one solution that handles model management and inference. | ||
|
|
||
| However, users have reported problems with the PyTorch engine integration and unstable behavior. Moreover, its API is a bit complex. | ||
|
|
||
| ### Consequences | ||
|
|
||
| * Good, because it has an API to show available models | ||
| * Good, because it handles model downloading automatically | ||
| * Neutral, because the API is complex | ||
| * Bad, because users have reported problems with the PyTorch engine integration and unstable behavior | ||
|
|
||
| ## Pros and Cons of the Options | ||
|
|
||
| ### LangChain4j | ||
|
|
||
| * Good, because it offers a high-level abstraction for LLM workflows | ||
| * Neutral, because it actually wraps other libraries like DJL or ONNX Runtime for the embeddings | ||
| * Bad, because it is a general LLM framework rather than a library focused on model management and inference | ||
|
|
||
| ### ONNX Runtime | ||
|
|
||
| * Good, because it is fast and efficient | ||
| * Bad, because it is a low-level inference engine and does not provide model management or downloading features out of the box | ||
| * Bad, because it supplies all binaries for different platforms at once and also supplies debugging symbols, which makes it larger than necessary (see [this issue in LangChain4j repository](https://github.com/langchain4j/langchain4j/issues/1492) and [this issue in ONNX repository](https://github.com/langchain4j/langchain4j/issues/1492)) | ||
|
|
||
| ### Deep Java Library (DJL) | ||
|
|
||
| * Good, because it supports multiple engines including PyTorch and ONNX | ||
| * Good, because it has a built-in model zoo for downloading models | ||
| * Neutral, because its API is a bit complex | ||
| * Bad, because of reported stability issues with certain engines |
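
The ADR above covers only which library produces the embeddings, not how they are consumed. As a rough illustration of the embeddings search step in RAG, the sketch below ranks paper chunks by cosine similarity to a query vector. It assumes cosine similarity as the metric, and the class and method names (`EmbeddingSearchSketch`, `cosineSimilarity`, `topK`) are hypothetical, not taken from this PR. The `float[]` vectors stand in for whatever an embedding model would return.

```java
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/** Hypothetical sketch: the vectors stand in for the output of an embedding model. */
final class EmbeddingSearchSketch {

    /** Cosine similarity of two vectors of equal dimension; 0.0 if either is all zeros. */
    static double cosineSimilarity(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("dimension mismatch");
        }
        double dot = 0;
        double normA = 0;
        double normB = 0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0 || normB == 0) {
            return 0.0;
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }

    /** Returns the indices of the k chunks most similar to the query, best match first. */
    static List<Integer> topK(float[] query, List<float[]> chunks, int k) {
        return IntStream.range(0, chunks.size())
                .boxed()
                .sorted(Comparator.comparingDouble(
                        (Integer i) -> -cosineSimilarity(query, chunks.get(i))))
                .limit(k)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Toy two-dimensional "embeddings"; real models produce hundreds of dimensions.
        List<float[]> chunks = List.of(
                new float[] {0f, 1f},
                new float[] {1f, 0f},
                new float[] {1f, 1f});
        System.out.println(topK(new float[] {1f, 0f}, chunks, 2));
    }
}
```

The retrieved chunk indices would then be used to select the text passed to the language model as context.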
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,89 @@ | ||
| --- | ||
| nav_order: 0059 | ||
| parent: Decision Records | ||
| --- | ||
| # Use CUID2 for `aiLibraryId` | ||
|
|
||
| ## Context and Problem Statement | ||
|
|
||
| JabRef stores an `aiLibraryId` in the library's metadata to associate AI artifacts (chat history, summaries, embeddings) with a specific `.bib` library across launches. | ||
| The id is serialized into the `.bib` file as `@Comment{jabref-meta: aiLibraryId:<id>;}` and is therefore visible to anyone who opens the file in a text editor. | ||
| Carrying the id inside the file content (rather than keying off the file path) is what lets AI artifacts stay correlated with the library even when the user renames or moves the `.bib` file. | ||
|
|
||
| Because `.bib` files are routinely shared between researchers (e.g., via Git, email, cloud drives, supplementary material of papers), the id ends up in human-facing contexts. | ||
| A v4 UUID such as `550e8400-e29b-41d4-a716-446655440000` looks alarming or "machine-y" to a researcher who is just inspecting their references file. | ||
|
|
||
| What identifier scheme should we use for `aiLibraryId`? | ||
|
|
||
| ## Decision Drivers | ||
|
|
||
| * The id must be globally unique with negligible collision probability (multiple researchers can independently create libraries; ids must not clash when libraries are merged). | ||
| * The id must be stable across JabRef launches and cross-platform. | ||
| * The id should look reasonably unobtrusive when a researcher reads the `.bib` file in a text editor — BibTeX files are shared, and the id should not say "WTF". | ||
| * The id should be generated locally without contacting a server (consistent with [ADR-0034](0034-use-citation-key-for-grouping-chat-messages.md): no server is available). | ||
| * Prefer a modern, actively maintained scheme. | ||
|
|
||
| ## Considered Options | ||
|
|
||
| * `UUID.randomUUID()` (RFC 4122 v4 UUID). | ||
| * [CUID2](https://github.com/paralleldrive/cuid2). | ||
| * Short hash of the file path / first entry. | ||
|
|
||
| ## Decision Outcome | ||
|
|
||
| Chosen option: **CUID2**, because it offers the same collision-resistance guarantees as a v4 UUID while producing a shorter, lowercase, alphanumeric string that is far less jarring inside a shared `.bib` file. | ||
| The Java port `io.github.thibaultmeyer:cuid` is on the dependency graph, and its v2.x line implements the CUID2 specification. | ||
|
|
||
| `AiService.ensureAiLibraryIdPresent` generates the id via the CUID2 generator. | ||
| The id remains an opaque `String` from the rest of the code's perspective, so no API changes propagate beyond that call site. | ||
|
|
||
| ### Consequences | ||
|
|
||
| * Good, because the id is shorter (~24 chars instead of 36) and lowercase alphanumeric, which reads better in a shared `.bib` file. | ||
| * Good, because CUID2 is explicitly designed to be collision-resistant for horizontally-distributed generation, which matches our case (every JabRef install generates ids independently). | ||
| * Good, because CUID2 is, by design, hard to guess — slightly better than v4 UUIDs against fingerprinting if an id ever leaks into a URL or log. | ||
| * Bad, because we carry a small additional dependency surface compared to the JDK built-in `UUID`. | ||
| * Bad, because CUID2 is less universally recognized than UUID — a developer encountering one for the first time may need a moment to identify the format. | ||
|
|
||
| ### Confirmation | ||
|
|
||
| The serialization round-trip tests (`BibDatabaseWriterTest.writeAiLibraryId`, `MetaDataParser`) treat the value as an opaque string and pass with a CUID2 value. | ||
| A code review of `AiService.ensureAiLibraryIdPresent` confirms the CUID2 generator is the only source of new ids. | ||
|
|
||
| ## Pros and Cons of the Options | ||
|
|
||
| ### `UUID.randomUUID()` | ||
|
|
||
| Example: `550e8400-e29b-41d4-a716-446655440000`. | ||
|
|
||
| * Good, because it is built into the JDK — no extra dependency. | ||
| * Good, because it is universally recognized. | ||
| * Neutral, because collision probability is negligible (122 random bits). | ||
| * Bad, because the canonical form (`8-4-4-4-12` hex with hyphens) is long and visually noisy in a `.bib` file shared with researchers. | ||
| * Bad, because it conveys a "this is a generated machine token" feeling that is at odds with the otherwise human-readable nature of `.bib` files. | ||
|
|
||
| ### CUID2 | ||
|
|
||
| Example: `tz4a98xxat96iws9zmbrgj3a`. | ||
|
|
||
| Java port used: [thibaultmeyer/cuid-java](https://github.com/thibaultmeyer/cuid-java). | ||
|
|
||
| * Good, because the textual form is shorter and lowercase alphanumeric, blending in with other identifiers researchers already see (citation keys, DOIs). | ||
| * Good, because the spec is explicit about collision resistance under distributed generation. | ||
| * Good, because it is a modern, actively maintained scheme (the original CUID has been deprecated in favor of CUID2). | ||
| * Good, because it is already used in the indexing and OpenOffice integration. | ||
| * Bad, because it is one more dependency to track. | ||
| * Bad, because it is slightly less familiar to developers than UUID. | ||
|
|
||
| ### Short hash of the file path / first entry | ||
|
|
||
| Example: `a3f1c9d2` (CRC32 / truncated SHA-1 of the absolute path). | ||
|
|
||
| * Good, because it is deterministic: when the hash is derived from the first entry, moving a `.bib` file would not orphan its AI artifacts (a path-based hash would change). | ||
| * Bad, because it is not unique: two libraries can share a citation key, and file paths change. | ||
| * Bad, because if a user copies a library, both copies would point at the same AI artifacts — exactly what `aiLibraryId` is meant to prevent. | ||
| * Bad, because the id would change if the underlying input changes, breaking the stability requirement. | ||
|
|
||
| ## More Information | ||
|
|
||
| Implementation site: `AiService.ensureAiLibraryIdPresent` in `jablib/src/main/java/org/jabref/logic/ai/AiService.java`. | ||
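
The ADR states that the id is written as `@Comment{jabref-meta: aiLibraryId:<id>;}` and that the round-trip tests treat it as an opaque string. A minimal sketch of that round trip follows, using only the JDK. The class `AiLibraryIdComment` and its methods are hypothetical and are not JabRef's actual writer or `MetaDataParser`; real metadata serialization may apply escaping that this sketch ignores. The two sample ids are the ones quoted in the ADR.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper, not JabRef's actual metadata writer or parser. */
final class AiLibraryIdComment {

    // Matches the comment form quoted in the ADR; the id is captured as an opaque token.
    private static final Pattern META_LINE =
            Pattern.compile("@Comment\\{jabref-meta: aiLibraryId:([^;}]+);\\}");

    /** Formats the metadata comment for the given id without inspecting the id's format. */
    static String write(String id) {
        return "@Comment{jabref-meta: aiLibraryId:" + id + ";}";
    }

    /** Extracts the id from .bib content, or returns empty if no such comment is present. */
    static Optional<String> read(String bibContent) {
        Matcher matcher = META_LINE.matcher(bibContent);
        return matcher.find() ? Optional.of(matcher.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        String cuid2 = "tz4a98xxat96iws9zmbrgj3a";
        String uuid = "550e8400-e29b-41d4-a716-446655440000";
        // Both schemes survive the round trip because the id is never interpreted.
        System.out.println(write(cuid2) + " (" + cuid2.length() + " chars)");
        System.out.println(write(uuid) + " (" + uuid.length() + " chars)");
        System.out.println(read(write(cuid2)).orElseThrow());
    }
}
```

Because nothing here depends on the id's internal structure, switching the generator from UUID to CUID2 leaves this kind of code unchanged, which matches the ADR's point that the change stays local to the call site that generates the id.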