Personal finance AI (v1) #2022

Merged
zachgoll merged 66 commits from zachgoll/ai-improvements into main 2025-03-29 01:08:22 +08:00
zachgoll commented 2025-03-24 21:21:12 +08:00 (Migrated from github.com)

This PR is a continuation of #1985 and finalizes the "V1" of the personal finance AI chat feature.

Domain overview

  • Chat - has many messages, has one "assistant"
    • Chat::Debuggable - defines "debug mode", where the chat will persist verbose debug messages to help better understand the path it took to get to its final response
  • Message - a message can be a "user message", "assistant message", or "developer message" (uses STI)
  • ToolCall - belongs to a Message and is a "subroutine" a message uses to augment its response. Tool calls are only relevant for assistant messages and are optional.
  • Assistant - owned by Chat, this represents a generic "LLM assistant" that can perform chat completions
    • Assistant::Provided is responsible for finding the correct provider for a given response (i.e. the user can select a model to use for each message)
  • Assistant::Functions - the assistant comes with a library of pre-defined "functions" that the LLM provider can call to augment chat responses. An Assistant::Functions::ConcreteFunction must provide a name, description, parameters schema, and call(params = {}) method. These are passed to and executed by the Provider.
  • Provider::ConcreteLLMProvider (e.g. Provider::OpenAI)
    • Assistant::Provideable defines the interface that all LLM providers must implement to provide completions for the assistant
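
Since `Message` uses STI, each subclass answers for its own role. A minimal plain-Ruby sketch of that dispatch (illustrative only; these are not the app's actual models, and the base-class error is an assumption):

```ruby
# Hypothetical sketch of the STI role mapping, not the real ActiveRecord models.
class Message
  def role
    raise NotImplementedError, "subclasses must define a role"
  end
end

class UserMessage < Message
  def role = "user"
end

class AssistantMessage < Message
  def role = "assistant"
end

class DeveloperMessage < Message
  def role = "developer"
end

UserMessage.new.role # => "user"
```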

Swappable LLM Providers

This first version of the chat implements a single Provider::OpenAI, which implements the Assistant::Provideable interface.

Provider responsibilities

Each "LLM Provider" is responsible for:

  • Providing chat completions
  • Executing tool calls given the provided functions
  • Handling streaming of responses
  • Building a generic Assistant::Provideable::ChatResponse for the Assistant to use

Concrete LLM implementations

To introduce a new LLM, simply implement the interface:

class Provider::Anthropic
  include Assistant::Provideable

  def chat_response(chat_history:, model: nil, instructions: nil, functions: [])
    provider_response do
      Assistant::Provideable::ChatResponse.new(
        # Anthropic implementation
      )
    end
  end
end

This way, Assistant can easily choose different models for each chat message:

module Assistant::Provided
  extend ActiveSupport::Concern

  def get_model_provider(ai_model)
    available_providers.find { |provider| provider.supports_model?(ai_model) }
  end

  private
    def available_providers
      [ Providers.openai ].compact
    end
end

class Assistant
  def respond_to(message)

    # Assistant can easily "swap out" providers based on the user's selection for each message
    provider = get_model_provider(message.ai_model)

    # response = provider.chat_response( ... )
  end
end
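
The lookup above only relies on each provider answering supports_model?. A free-standing sketch of that selection logic, using a hypothetical stub in place of the real Provider::OpenAI:

```ruby
# OpenAIStub is a hypothetical stand-in for the real provider class;
# only the supports_model? interface method is exercised here.
OpenAIStub = Struct.new(:models) do
  def supports_model?(ai_model) = models.include?(ai_model)
end

PROVIDERS = [ OpenAIStub.new([ "gpt-4o", "gpt-4o-mini" ]) ]

# Mirrors Assistant::Provided#get_model_provider as a free-standing method
def get_model_provider(ai_model)
  PROVIDERS.find { |provider| provider.supports_model?(ai_model) }
end

get_model_provider("gpt-4o")   # finds the OpenAI stub
get_model_provider("claude-3") # => nil, no configured provider supports it
```

Because unsupported models resolve to nil, the Assistant can fail fast (or fall back) before attempting a completion.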

Turbo frames, streams, and broadcasts

The AI chat feature is entirely contained within the global sidebar, and uses Turbo frames to load the various resource views for the Chat resource (i.e. new, show, index).

In the application.html.erb layout, the chat_view_path(@chat) helper determines which resource view should currently show in the chat sidebar:

  • If @chat is set and it is a persisted record, the sidebar loads the show path
  • If @chat is set and it is a new record, the sidebar loads new
  • If @chat is nil, show index
def chat_view_path(chat)
  return new_chat_path if params[:chat_view] == "new"
  return chats_path if chat.nil? || params[:chat_view] == "all"

  chat.persisted? ? chat_path(chat) : new_chat_path
end
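
The branching can be exercised outside Rails. In this sketch, FakeChat and the hard-coded path strings are stand-ins for the real model and route helpers:

```ruby
# FakeChat and the literal paths are hypothetical stand-ins; the branching
# mirrors the chat_view_path helper above.
FakeChat = Struct.new(:id, :persisted) do
  def persisted? = persisted
end

def chat_view_path(chat, params = {})
  return "/chats/new" if params[:chat_view] == "new"
  return "/chats" if chat.nil? || params[:chat_view] == "all"

  chat.persisted? ? "/chats/#{chat.id}" : "/chats/new"
end

chat_view_path(nil)                      # => "/chats" (index)
chat_view_path(FakeChat.new(42, true))   # => "/chats/42" (show)
chat_view_path(FakeChat.new(nil, false)) # => "/chats/new" (new)
chat_view_path(nil, chat_view: "new")    # => "/chats/new" (forced via param)
```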
<% if Current.user.ai_enabled? %>
  <%= turbo_frame_tag chat_frame, src: chat_view_path(@chat), loading: "lazy", class: "h-full" do %>
    <div class="flex justify-center items-center h-full">
      <%= lucide_icon("loader-circle", class: "w-5 h-5 text-secondary animate-spin") %>
    </div>
  <% end %>
<% else %>
  <%= render "chats/ai_consent" %>
<% end %>

Chat state

There are two important concepts for managing the sidebar chat state:

  • "Current chat" - identified by the @chat controller instance variable
  • "Last viewed chat" - used to preserve sidebar state across page refreshes

The @chat variable is set via inheritance in ApplicationController and ChatsController. In other words, the sidebar defaults to showing the "last viewed chat" unless told otherwise by a more specific action in the inheritance hierarchy:

class ApplicationController < ActionController::Base
  before_action :set_default_chat

  private
    # By default, we show the user the last chat they interacted with
    def set_default_chat
      @last_viewed_chat = Current.user&.last_viewed_chat
      @chat = @last_viewed_chat
    end
end

class ChatsController < ApplicationController
  before_action :set_chat, only: [ :show, :edit, :update, :destroy ]

  # override application_controller default behavior of setting @chat to last viewed chat
  def index
    @chat = nil
  end

  def show
    set_last_viewed_chat(@chat)
  end

  def new
    @chat = Current.user.chats.new(title: "New chat #{Time.current.strftime("%Y-%m-%d %H:%M")}")
  end
end

Broadcasts and "thinking"

Assistant responses run in background jobs and therefore require a "thinking" indicator. The base Message model implements create and update callbacks, both of which broadcast the changes to the Chat when broadcast? returns true on the specific message type.

  • All creates and updates broadcast directly to the chat
  • If the message is a user message, we request an assistant response, which will enqueue a response job
  • The job creates/updates assistant messages, which will trigger broadcasts to the chat through these callbacks
  • The chat's show action can have a ?thinking=true param to trigger the AI "thinking" message. The AssistantResponseJob is then responsible for removing that message when the response is complete (otherwise, it is just removed on the next page refresh since the param is passed in a turbo frame). See ChatsController#create and MessagesController#create where we redirect_to chat_path(@chat, thinking: true) to immediately show the "thinking" message.
# User messages have a special `request_response` hook
class UserMessage < Message
  validates :ai_model, presence: true

  after_create_commit :request_response_later

  def role
    "user"
  end

  def request_response_later
    chat.ask_assistant_later(self)
  end

  def request_response
    chat.ask_assistant(self)
  end

  private
    def broadcast?
      true
    end
end
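
The broadcast? gate described above can be sketched without Rails. In this illustration an array stands in for the Turbo Stream broadcasts fired by the real create/update callbacks, and the base class defaulting broadcast? to false is an assumption for the sketch:

```ruby
# Hypothetical sketch of the broadcast gate on the base Message model;
# an array records "broadcasts" in place of real Turbo Stream callbacks.
class SketchMessage
  attr_reader :broadcasts

  def initialize
    @broadcasts = []
  end

  # Subclasses opt in by overriding this (assumed default: no broadcast)
  def broadcast?
    false
  end

  # Stands in for the create/update callbacks on the base model
  def save
    @broadcasts << :appended_to_chat if broadcast?
  end
end

class SketchUserMessage < SketchMessage
  def broadcast? = true
end
```

Saving a SketchUserMessage records a broadcast, while saving the base class records nothing, mirroring how only opted-in message types push updates to the chat.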
ZainKergaye commented 2025-06-08 12:04:30 +08:00 (Migrated from github.com)

Hey I'd love to use this and try it out! I'm self hosting now, is it possible to expose an env var to change the openai servers to an internal one? I'm specifically hosting openwebui and it should (theoretically) just be a drop in replacement.
