Data provider simplification, tests, and documentation #1997

Merged
zachgoll merged 12 commits from zachgoll/data-provider-concepts into main 2025-03-17 23:54:53 +08:00
zachgoll commented 2025-03-17 22:16:23 +08:00 (Migrated from github.com)

This PR simplifies and consolidates 3rd party data access patterns throughout the app. It prioritizes:

  • Simplicity
  • "Fail fast"
  • Graceful handling of missing provider configurations (for self hosted apps)

Before / after

Providers are now much clearer and simpler to write.

Provider::Synth before (example)

class Provider::Synth
  include Retryable

  def fetch_exchange_rate(from:, to:, date:)
    retrying Provider::Base.known_transient_errors do |on_last_attempt|
      response = client.get("#{base_url}/rates/historical") do |req|
        req.params["date"] = date.to_s
        req.params["from"] = from
        req.params["to"] = to
      end

      if response.success?
        ExchangeRateResponse.new \
          rate: JSON.parse(response.body).dig("data", "rates", to),
          success?: true,
          raw_response: response
      else
        if on_last_attempt
          ExchangeRateResponse.new \
            success?: false,
            error: build_error(response),
            raw_response: response
        else
          raise build_error(response)
        end
      end
    end
  end

  private
    attr_reader :api_key

    SecurityPriceResponse = Struct.new :prices, :success?, :error, :raw_response, keyword_init: true

    def build_error(response)
      Provider::Base::ProviderError.new(<<~ERROR)
        Failed to fetch data from #{self.class}
          Status: #{response.status}
          Body: #{response.body.inspect}
      ERROR
    end
end

Provider::Synth after

class Provider::Synth < Provider
  include ExchangeRate::Provideable

  def fetch_exchange_rate(from:, to:, date:)
    provider_response retries: 2 do
      response = client.get("#{base_url}/rates/historical") do |req|
        req.params["date"] = date.to_s
        req.params["from"] = from
        req.params["to"] = to
      end

      rates = JSON.parse(response.body).dig("data", "rates")

      ExchangeRate::Provideable::FetchRateData.new(
        rate: ExchangeRate.new(
          from_currency: from,
          to_currency: to,
          date: date,
          rate: rates.dig(to)
        )
      )
    end
  end
end

Data Providers

The Maybe app uses several 3rd party data services to calculate historical account balances, enrich data, and more. Because the app can run in both "hosted" and "self hosted" modes, data providers are optional for self hosted users and must be configured explicitly.

Because providers are optional, they are configured at runtime through the @providers.rb module, which uses @setting.rb for runtime parameters such as API keys. A provider method returns nil when its configuration is missing:

module Providers
  module_function

  def synth
    api_key = ENV.fetch("SYNTH_API_KEY", Setting.synth_api_key)

    return nil unless api_key.present?

    Provider::Synth.new(api_key)
  end
end
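
Since `Providers.synth` returns nil when no API key is present, callers have to tolerate a missing provider. The following is a minimal plain-Ruby sketch of that pattern, not the app's code: `DemoProviders`, `DemoSynth`, and the `config` hash (standing in for `ENV` and `Setting`) are hypothetical names introduced for illustration.

```ruby
# Hypothetical stand-in for a concrete provider; not the app's Provider::Synth.
class DemoSynth
  attr_reader :api_key

  def initialize(api_key)
    @api_key = api_key
  end

  def usage
    "usage for key ending in #{api_key[-4..]}"
  end
end

# Hypothetical registry mirroring the shape of the Providers module above.
# A plain hash stands in for ENV and Setting so the sketch runs without Rails.
module DemoProviders
  module_function

  def synth(config)
    api_key = config["SYNTH_API_KEY"]

    # Plain-Ruby equivalent of `api_key.present?`
    return nil if api_key.nil? || api_key.strip.empty?

    DemoSynth.new(api_key)
  end
end

configured   = DemoProviders.synth({ "SYNTH_API_KEY" => "secret-1234" })
unconfigured = DemoProviders.synth({})

# Safe navigation keeps a missing provider from raising NoMethodError.
configured_usage   = configured&.usage
unconfigured_usage = unconfigured&.usage
```

With this shape, a self hosted install without a key gets nil back rather than an exception, and the caller decides how to degrade.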

There are two types of 3rd party data in the Maybe app:

  1. "Concept" data
  2. One-off data

"Concept" data

Since the app is self hostable, users may prefer different providers for generic data like exchange rates and security prices. When data is generic enough that we can easily swap out different providers, we call it a data "concept".

Each "concept" must have a Provideable concern that defines the methods that must be implemented along with the data shapes that are returned. For example, an "exchange rates concept" might look like this:

app/models/
  exchange_rate.rb # <- ActiveRecord model and "concept"
  exchange_rate/
    provided.rb # <- Chooses the provider for this concept based on user settings / config
    provideable.rb # <- Defines interface for providing exchange rates
  provider.rb # <- Base provider class
  provider/
    synth.rb # <- Concrete provider implementation

The Provideable concern for this concept would look something like this:

# Defines the interface an exchange rate provider must implement
module ExchangeRate::Provideable
  extend ActiveSupport::Concern

  FetchRateData = Data.define(:rate)
  FetchRatesData = Data.define(:rates)

  def fetch_exchange_rate(from:, to:, date:)
    raise NotImplementedError, "Subclasses must implement #fetch_exchange_rate"
  end

  def fetch_exchange_rates(from:, to:, start_date:, end_date:)
    raise NotImplementedError, "Subclasses must implement #fetch_exchange_rates"
  end
end

Any provider that is a valid exchange rate provider must implement this interface:

class ConcreteProvider < Provider 
  include ExchangeRate::Provideable

  def fetch_exchange_rate(from:, to:, date:)
    provider_response do 
      ExchangeRate::Provideable::FetchRateData.new(
        rate: ExchangeRate.new # build response
      )
    end
  end

  def fetch_exchange_rates(from:, to:, start_date:, end_date:)
    # Implementation
  end
end
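
To illustrate how the interface is enforced, here is a small plain-Ruby sketch in which a provider implements only one of the two methods. `DemoProvideable` and `PartialProvider` are hypothetical names, and `Struct` is used in place of `Data.define` only so the sketch also runs on Ruby versions older than 3.2.

```ruby
# Hypothetical interface module, mirroring the shape of ExchangeRate::Provideable
# without ActiveSupport so it runs in plain Ruby.
module DemoProvideable
  FetchRateData = Struct.new(:rate, keyword_init: true)

  def fetch_exchange_rate(from:, to:, date:)
    raise NotImplementedError, "#{self.class} must implement #fetch_exchange_rate"
  end

  def fetch_exchange_rates(from:, to:, start_date:, end_date:)
    raise NotImplementedError, "#{self.class} must implement #fetch_exchange_rates"
  end
end

# Implements only one of the two interface methods.
class PartialProvider
  include DemoProvideable

  def fetch_exchange_rate(from:, to:, date:)
    FetchRateData.new(rate: 1.35) # hard-coded value for illustration
  end
end

provider = PartialProvider.new
single = provider.fetch_exchange_rate(from: "USD", to: "CAD", date: "2025-03-17")

# The method that was not overridden falls through to the module's default.
missing_message =
  begin
    provider.fetch_exchange_rates(
      from: "USD", to: "CAD", start_date: "2025-03-01", end_date: "2025-03-17"
    )
    nil
  rescue NotImplementedError => e
    e.message
  end
```

One design detail worth knowing: `NotImplementedError` descends from `ScriptError`, not `StandardError`, so a bare `rescue` will not catch it. A missing interface method therefore surfaces loudly, which fits the "fail fast" goal.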

One-off data

For data that does not fit neatly into a "concept", a Provideable is not required and the concrete provider may implement ad-hoc methods called directly in code. For example, the @synth.rb provider has a usage method that is only applicable to this specific provider. This should be called directly without any abstractions:

class SomeModel < ApplicationRecord
  def synth_usage
    Providers.synth.usage
  end
end

"Provided" Concerns

In general, domain models should not call Providers.some_provider directly. When a domain model needs 3rd party data, we use the Provided concern within that model's namespace. This concern is primarily responsible for:

  • Choosing the provider to use for this "concept"
  • Providing convenience methods on the model for accessing data

For example, @exchange_rate.rb has a @provided.rb concern with the following convenience methods:

module ExchangeRate::Provided
  extend ActiveSupport::Concern

  class_methods do
    def provider
      Providers.synth
    end

    def find_or_fetch_rate(from:, to:, date: Date.current, cache: true)
      # Implementation 
    end

    def sync_provider_rates(from:, to:, start_date:, end_date: Date.current)
      # Implementation 
    end
  end
end

This exposes a generic access pattern where the caller does not care which provider has been chosen for the concept of exchange rates and can get a predictable response:

def access_patterns_example
  # Call exchange rate provider directly
  ExchangeRate.provider.fetch_exchange_rate(from: "USD", to: "CAD", date: Date.current)

  # Call convenience method
  ExchangeRate.sync_provider_rates(from: "USD", to: "CAD", start_date: 2.days.ago.to_date)
end
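
The bodies of the convenience methods are elided above. As one plausible shape, and not the app's actual implementation, a "find or fetch" method could check a local store, fall back to the provider, and return nil when no provider is configured. Everything in this sketch is hypothetical: `DemoExchangeRate` uses an in-memory hash in place of ActiveRecord, the provider is injected through an accessor, and the provider returns a rate directly where the real one would return a provider response.

```ruby
# Hypothetical in-memory model; the real ExchangeRate is an ActiveRecord model.
class DemoExchangeRate
  Rate = Struct.new(:from, :to, :date, :rate, keyword_init: true)

  class << self
    # Hypothetical: the provider is injected so the sketch can simulate a
    # self hosted app with no provider configured (nil).
    attr_accessor :provider

    def store
      @store ||= {}
    end

    def find_or_fetch_rate(from:, to:, date:, cache: true)
      key = [from, to, date]
      return store[key] if store.key?(key)

      # No provider configured: return nil instead of raising.
      return nil if provider.nil?

      rate = provider.fetch_exchange_rate(from: from, to: to, date: date)
      store[key] = rate if cache && rate
      rate
    end
  end
end

# Hypothetical provider that counts how often it is called.
class CountingProvider
  attr_reader :calls

  def initialize
    @calls = 0
  end

  def fetch_exchange_rate(from:, to:, date:)
    @calls += 1
    DemoExchangeRate::Rate.new(from: from, to: to, date: date, rate: 1.35)
  end
end

DemoExchangeRate.provider = nil
without_provider = DemoExchangeRate.find_or_fetch_rate(from: "USD", to: "CAD", date: "2025-03-17")

counting = CountingProvider.new
DemoExchangeRate.provider = counting
first  = DemoExchangeRate.find_or_fetch_rate(from: "USD", to: "CAD", date: "2025-03-17")
second = DemoExchangeRate.find_or_fetch_rate(from: "USD", to: "CAD", date: "2025-03-17")
```

The second lookup is served from the store, so the provider is only called once for the same currency pair and date.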

Concrete provider implementations

Each 3rd party data provider should have a class under the Provider:: namespace that inherits from Provider. Its public methods should wrap their work in provider_response, which returns a Provider::ProviderResponse object:

class ConcreteProvider < Provider
  def fetch_some_data
    provider_response do
      ExampleData.new(
        example: "data"
      )
    end
  end
end

provider_response automatically catches provider errors, so concrete provider classes should raise when valid data cannot be returned:

class ConcreteProvider < Provider
  def fetch_some_data
    provider_response do
      data = nil

      # Raise an error if data cannot be returned
      raise ProviderError.new("Could not find the data you need") if data.nil?

      data
    end
  end
end
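
The base `Provider` class is not shown in this description, so the following is only a sketch of how a `provider_response` helper with a `retries:` option might behave. It assumes a response object with `success?`, `data`, and `error` fields and retries on any `StandardError`; the real helper may use different fields and may only retry specific transient errors. `DemoProvider` and `FlakyProvider` are hypothetical names.

```ruby
# Hypothetical base class; the real Provider is not shown in this PR description.
class DemoProvider
  # Assumed error and response shapes (names are illustrative only).
  Error = Class.new(StandardError)
  Response = Struct.new(:success?, :data, :error, keyword_init: true)

  private

  # Runs the block, retrying up to `retries` extra times when it raises,
  # and always returns a Response instead of letting the error escape.
  def provider_response(retries: 0)
    attempts = 0
    begin
      attempts += 1
      data = yield
      Response.new(success?: true, data: data, error: nil)
    rescue StandardError => e
      retry if attempts <= retries
      Response.new(success?: false, data: nil, error: e)
    end
  end
end

# Hypothetical provider that fails a fixed number of times before succeeding.
class FlakyProvider < DemoProvider
  attr_reader :attempts

  def initialize(failures:)
    @failures = failures
    @attempts = 0
  end

  def fetch_some_data
    provider_response retries: 2 do
      @attempts += 1
      raise Error, "transient failure" if @attempts <= @failures

      { example: "data" }
    end
  end
end

recovered = FlakyProvider.new(failures: 2).fetch_some_data
exhausted = FlakyProvider.new(failures: 5).fetch_some_data
```

Callers then branch on `success?` rather than rescuing exceptions themselves, which keeps retry and error handling in one place.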