feat(eval): public benchmark harness (PaperAudit, LimitGen, CLAIMCHECK, MMReview, PeerQA)

April 17, 2026 · #153
Python · Difficulty: Medium

Labels

enhancement, area: pipeline, priority: high