Orange Words

About

Orange Words is a playground where I tinker with the combination of hacker news data, search, rag, and machine learning.

Data

The data includes a backfill of all past items from hacker news, as well as an ongoing sync of the latest items (every 30 minutes or so). This includes updating items after their edit window has closed. The processing also includes enriching the data, to make it more searchable or to support interesting features I want to develop.

The data is sourced from the Hacker News API:

https://github.com/HackerNews/API

Search Engine

The underlying search and data engine is a single node instance of Vespa, which runs on a robust Intel NUC server.

https://vespa.ai

Machine Learning

The language models used for the retrieval augmented generation (RAG) include a handful of interesting "open" models served by together.ai, as well as various gpt models from openai.

Web Stack

The web stack is composed of python, fastapi, htmx, _hyperscript, and tailwind.

Who

My name is cody, please feel free to reach out!

web: codycollier.com
email: cmcollier@gmail.com
x/twitter: @cmcollier
linkedin: codycollier

Change Log

Winter 2025:

Migrated from Flask to FastAPI

Summer 2024:

Improved ingestion code efficiency
Added metadata for sync tracking

Spring 2024:

Improved the search for RAG based chat
Added support for Llama 3 (via Together)
Adjusted model options and config
Misc query and latest item adjustments

Winter 2023-2024:

Added support for session model switching
Added support for models from Together.ai
Added RAG based multi-turn chat
Added RAG based Q&A with lexical search
Added support for models from OpenAI
Initial launch of public site with search