Implement swappable provider infrastructure #561

Closed
josefarias wants to merge 17 commits from jose/providers into main
josefarias commented 2024-03-25 05:40:27 +08:00 (Migrated from github.com)

This is a reference PR, superseded by https://github.com/maybe-finance/maybe/pull/574

It is meant as an exploration to validate how well a proposed solution might scale.
It might also serve as reference when we decide to implement providers other than Synth.
The superseding PR should be preferred with regards to the small differences between this PR and that one.




Original PR message:

This PR serves as a proposal for a generic data provider system that allows for swapping interchangeable providers.

It shouldn't be merged as-is; as it implements merchant and real estate valuation providers in an effort to show how this solution scales.

Now, on to the PR...

Mental model

Providers are implemented as a Provider::<Name> class. The following providers exist in this PR: Provider::Base, Provider::Synth, Provider::Zillow, Provider::Null.

Provider::Base should never be instantiated. It is meant to hold the (currently small) amount of shared concepts between providers.

Provider::Null is a default provider for use when Maybe doesn't have an opinion on which provider you should use and the user hasn't configured a provider. It's designed to blow up and communicate the service degradation to the user. I think this is better than implementing e.g. Provider::Local, since I don't think it's viable to mimic provider functionality offline.

Each provider class should be thought of as a world separate from Maybe. This allows them to evolve with the external platform without having to worry about how that affects our app. On that same note, I don't recommend sharing much code between providers. Such that they can evolve on their own timeline without affecting our app's internals.

Http calls made to external servers live inside these clases. Domain-agnostic objects are built and returned when the http calls conclude.

Domain-aware objects (i.e. the regular Maybe models) use these provider classes, interpret their responses, and turn them into Maybe domain objects like Transaction, Valuation, or Merchant.

How it works

The chosen providers and their API keys are configured as environment variables, which are read from data_providers.yml. Implementing users only need to change their env vars to change providers.

Models which need provided data include the Providable concern — which knows which Provider::<Name> class to instantiate for each concept depending on data_providers.yml.

Finally, Provider::<Name> classes are instantiated and called wherever needed (class methods, instance methods, scripts, etc) — and their response objects are used and interpreted as necessary.

Enforcing a contract between providers

As we add more providers, it becomes hard to test all possible combinations. If a common provider's interface drifts away from a less common provider's — it can be weeks until we get a report from users saying their setups stopped working.

This is why this PR implements interface testing. By including an interface test in Provider::<Name>Test, you can guarantee that the provider returns an object of the expected shape. If it stops doing so, the test will fail — alerting the developer that their changes either need to be backwards-compatible, or they need to make the change for all providers of the same concept.

Before we merge

The first step is deciding whether this is the path we want to go down. If it is, the next step is reverting the illustrative aspects of this PR — the bits that have to do with merchants and real estate valuations. Finally, I'd suggest merging the remaining code in a new PR, post-review. EDIT: This is what we've decided to do.

# This is a reference PR, superseded by https://github.com/maybe-finance/maybe/pull/574 It is meant as an exploration to validate how well a proposed solution might scale. It might also serve as reference when we decide to implement providers other than Synth. The [superseding PR](https://github.com/maybe-finance/maybe/pull/574) should be preferred with regards to the small differences between this PR and that one. <br> <br> <br> Original PR message: --- This PR serves as a proposal for a generic data provider system that allows for swapping interchangeable providers. It shouldn't be merged as-is; as it implements merchant and real estate valuation providers in an effort to show how this solution scales. Now, on to the PR... ## Mental model Providers are implemented as a `Provider::<Name>` class. The following providers exist in this PR: `Provider::Base`, `Provider::Synth`, `Provider::Zillow`, `Provider::Null`. `Provider::Base` should never be instantiated. It is meant to hold the (currently small) amount of shared concepts between providers. `Provider::Null` is a default provider for use when Maybe doesn't have an opinion on which provider you should use and the user hasn't configured a provider. It's designed to blow up and communicate the service degradation to the user. I think this is better than implementing e.g. `Provider::Local`, since I don't think it's viable to mimic provider functionality offline. Each provider class should be thought of as a world separate from Maybe. This allows them to evolve with the external platform without having to worry about how that affects our app. On that same note, I don't recommend sharing much code between providers. Such that they can evolve on their own timeline without affecting our app's internals. Http calls made to external servers live inside these clases. Domain-agnostic objects are built and returned when the http calls conclude. Domain-aware objects (i.e. the regular Maybe models) use these provider classes, interpret their responses, and turn them into Maybe domain objects like `Transaction`, `Valuation`, or `Merchant`. ## How it works The chosen providers and their API keys are configured as environment variables, which are read from `data_providers.yml`. Implementing users only need to change their env vars to change providers. Models which need provided data include the `Providable` concern — which knows which `Provider::<Name>` class to instantiate for each concept depending on `data_providers.yml`. Finally, `Provider::<Name>` classes are instantiated and called wherever needed (class methods, instance methods, scripts, etc) — and their response objects are used and interpreted as necessary. ## Enforcing a contract between providers As we add more providers, it becomes hard to test all possible combinations. If a common provider's interface drifts away from a less common provider's — it can be weeks until we get a report from users saying their setups stopped working. This is why this PR implements interface testing. By including an interface test in `Provider::<Name>Test`, you can guarantee that the provider returns an object of the expected shape. If it stops doing so, the test will fail — alerting the developer that their changes either need to be backwards-compatible, or they need to make the change for all providers of the same concept. ## Before we merge The first step is deciding whether this is the path we want to go down. If it is, the next step is reverting the illustrative aspects of this PR — the bits that have to do with merchants and real estate valuations. Finally, I'd suggest merging the remaining code in a new PR, post-review. EDIT: This is what we've decided to do.
josefarias (Migrated from github.com) reviewed 2024-03-25 12:39:02 +08:00
josefarias (Migrated from github.com) left a comment

This is ready for specific feedback.

This is ready for specific feedback.
@@ -0,0 +1,4 @@
class Merchant
josefarias (Migrated from github.com) commented 2024-03-25 12:33:10 +08:00

Added so tests don't fail. This would be an ActiveRecord in practice.

Added so tests don't fail. This would be an ActiveRecord in practice.
@@ -0,0 +3,4 @@
class Provider::NullTest < ActiveSupport::TestCase
include ExchangeRateProviderInterfaceTest
include MerchantDataProviderInterfaceTest
include RealEstateValuationsProviderInterfaceTest
josefarias (Migrated from github.com) commented 2024-03-25 12:37:22 +08:00

As much as the dynamically typed nature of ruby gains us, this is one area where type safety would've come in handy. These tests are a good middle ground I think.

I usually wouldn't implement them. But the project being open-source means many hands with varying degrees of familiarity will be touching this system. And I don't think providers will be added often after an initial flurry. So it's easy to forget things.

As much as the dynamically typed nature of ruby gains us, this is one area where type safety would've come in handy. These tests are a good middle ground I think. I usually wouldn't implement them. But the project being open-source means many hands with varying degrees of familiarity will be touching this system. And I don't think providers will be added often after an initial flurry. So it's easy to forget things.
josefarias (Migrated from github.com) reviewed 2024-03-25 12:42:55 +08:00
josefarias (Migrated from github.com) commented 2024-03-25 12:42:55 +08:00

Haven't given any thought to how rate limiting comes into play. I think Provided::<Concept> classes can be aware of how many requests have been performed via Providable and modify their actions accordingly.

Haven't given any thought to how rate limiting comes into play. I think `Provided::<Concept>` classes can be aware of how many requests have been performed via `Providable` and modify their actions accordingly.
rangerscience (Migrated from github.com) reviewed 2024-03-25 15:36:01 +08:00
@@ -0,0 +3,4 @@
class Provider::NullTest < ActiveSupport::TestCase
include ExchangeRateProviderInterfaceTest
include MerchantDataProviderInterfaceTest
include RealEstateValuationsProviderInterfaceTest
rangerscience (Migrated from github.com) commented 2024-03-25 15:36:01 +08:00

Have you played around much with the (newish) type systems in Ruby?

Have you played around much with the (newish) type systems in Ruby?
rangerscience (Migrated from github.com) reviewed 2024-03-25 15:45:41 +08:00
rangerscience (Migrated from github.com) commented 2024-03-25 15:45:41 +08:00

Hmm. If we unroll this -

  def exchange_rates_provider
    self.class.exchange_rates_provider
  end
    def exchange_rates_provider
      provider_for :exchange_rates
    end
      def provider_for(concept)
        "Provider::#{provider_name(concept).camelize}"
          .constantize
          .new(provider_api_key(concept))
      end
      def provider_name(concept)
        providers_config[concept][:provider].presence_in(KNOWN_PROVIDERS) ||
          raise(ArgumentError, "Unknown provider for #{concept}")
      end
shared:
  exchange_rates:
    provider: <%= ENV.fetch "EXCHANGE_RATES_PROVIDER", "synth" %>
    key: <%= ENV.fetch "EXCHANGE_RATES_API_KEY", ENV["SYNTH_API_KEY"] %>

So all this... looks like it evaluates to @provider = Providers::Synth.new?

Update:

I would recommend something like:

@provider = Providers.for(:exchange_rates)

You can then eliminate a few layers of abstraction.

Hmm. If we unroll this - ``` def exchange_rates_provider self.class.exchange_rates_provider end ``` ``` def exchange_rates_provider provider_for :exchange_rates end ``` ``` def provider_for(concept) "Provider::#{provider_name(concept).camelize}" .constantize .new(provider_api_key(concept)) end ``` ``` def provider_name(concept) providers_config[concept][:provider].presence_in(KNOWN_PROVIDERS) || raise(ArgumentError, "Unknown provider for #{concept}") end ``` ``` shared: exchange_rates: provider: <%= ENV.fetch "EXCHANGE_RATES_PROVIDER", "synth" %> key: <%= ENV.fetch "EXCHANGE_RATES_API_KEY", ENV["SYNTH_API_KEY"] %> ``` So all this... looks like it evaluates to `@provider = Providers::Synth.new`? Update: I would recommend something like: ``` @provider = Providers.for(:exchange_rates) ``` You can then eliminate a few layers of abstraction.
rangerscience (Migrated from github.com) reviewed 2024-03-25 16:00:15 +08:00
rangerscience (Migrated from github.com) left a comment

edit (complete comment rewrite): goddamnit nerd sniped myself on this :)

Okay. TL;DR, I think your approach here is overcomplicated things. Seems like you've got a few goals/needs/etc:

  1. You want an implementation pattern that's easy for other people to follow to make new providers
  2. You want a usage pattern so consumers of the code have a reliable, common abstraction, and don't have to worry about the provider's implementation details
  3. You want easy configuration via environment variables so that it's trivial to pick which provider you're using

Here's some quick drafts:

#config/initializers/exchange_rates.rb

name = ENV.fetch("EXCHANGE_RATES_PROVIDER") || "synth"
Providers.exchange_rate = case 
when 'synth', 'zillow'
  "Providers::#{name.titleize}".constantize.new
else
  raise(ArgumentError, "Unknown provider for exchange rates")
end

Right? One place with a very clear link between configuration value and results. Doesn't worry about how each provider is implemented, all it does it choose which one is in use.

class Provider::Synth
  def initialize(api_key=null)
    @api_key = api_key || ENV["SYNTH_API_KEY"]
  end

  def request url
    Faraday.get("#{base_url}/rates/historical") do |req|
        req.headers["Authorization"] = "Bearer #{api_key}"
        req.params = yield
     end
  end
end

Right? The provider is fully and solely responsible for knowing how to instantiate itself... but we can also override that behavior, if we want to, so we can have different instances if we find ourselves wanting to. We can also pull out anything common to every request; this is just one of many ways to do that (and, honestly, one I'd like to explore further).

Now, I'm also getting overly fancy with the yield, and that might not work out, but it's fun for now so eh :)

For errors and retry logic, Faraday has a bunch of things you can use :)

So now we get extremely concise API wrappers:

class Provider::Synth
  def request_exchange_rate(from:, to:, date:)
     request("#{base_url}/rates/historical") { { date: date.to_s, from: to: } }
   end
 end

But, this on it's own doesn't accomplish your #2. We can add a bit:

class Provider::Synth
  def request_exchange_rate(from:, to:, date:)
      request("#{base_url}/rates/historical") { { date: date.to_s, from: to: } }
  end
   
  def to_exchange_rate response, from:, to:, date:
      ExchangeRate.new \
        base_currency: from,
        converted_currency: to,
        rate: JSON.parse(response.body).dig("data", "rates", to),
        date: date
  end
     
  def get_exhange_rate(...)
    to_exchange_rate(request_exchange_rate(...))
  end
end

So now we've got parts. One part that knows how to get the data from Synth, and one part that knows how to turn the data we get from Synth into the data we want to use, and one that sticks them together. Plus, each part is testable independent of each other part, and, we've got a couple different levels we can

PS - ... is valid code; it's the "pass-through" language feature :)

Final note - I really don't mean to be "here's exactly how to do it", I mean it much more like "here's some inspiration for some very simple ways to approach this". The yield might not work out at all, and to_exchange_rate stuff might get wordy and overly boiler-plate. But, I think all this avoids locking future-you into any decisions but the critical ones (how it gets used). On the implementation side of things, all of this should be really flexible and easy to change.

edit (complete comment rewrite): goddamnit nerd sniped myself on this :) Okay. TL;DR, I think your approach here is overcomplicated things. Seems like you've got a few goals/needs/etc: 1. You want an implementation pattern that's easy for other people to follow to make new providers 2. You want a usage pattern so consumers of the code have a reliable, common abstraction, and don't have to worry about the provider's implementation details 3. You want easy configuration via environment variables so that it's trivial to pick which provider you're using Here's some quick drafts: ``` #config/initializers/exchange_rates.rb name = ENV.fetch("EXCHANGE_RATES_PROVIDER") || "synth" Providers.exchange_rate = case when 'synth', 'zillow' "Providers::#{name.titleize}".constantize.new else raise(ArgumentError, "Unknown provider for exchange rates") end ``` Right? One place with a _very_ clear link between configuration value and results. Doesn't worry about how each provider is implemented, all it does it choose which one is in use. ``` class Provider::Synth def initialize(api_key=null) @api_key = api_key || ENV["SYNTH_API_KEY"] end def request url Faraday.get("#{base_url}/rates/historical") do |req| req.headers["Authorization"] = "Bearer #{api_key}" req.params = yield end end end ``` Right? The provider is fully and solely responsible for knowing how to instantiate itself... but we can also override that behavior, if we want to, so we _can_ have different instances if we find ourselves wanting to. We can also pull out anything common to every request; this is just one of many ways to do that (and, honestly, one I'd like to explore further). Now, I'm also getting overly fancy with the `yield`, and that might not work out, but it's fun for now so eh :) For errors and retry logic, [Faraday has a bunch of things you can use](https://github.com/lostisland/awesome-faraday) :) So now we get extremely concise API wrappers: ``` class Provider::Synth def request_exchange_rate(from:, to:, date:) request("#{base_url}/rates/historical") { { date: date.to_s, from: to: } } end end ``` But, this on it's own doesn't accomplish your #2. We can add a bit: ``` class Provider::Synth def request_exchange_rate(from:, to:, date:) request("#{base_url}/rates/historical") { { date: date.to_s, from: to: } } end def to_exchange_rate response, from:, to:, date: ExchangeRate.new \ base_currency: from, converted_currency: to, rate: JSON.parse(response.body).dig("data", "rates", to), date: date end def get_exhange_rate(...) to_exchange_rate(request_exchange_rate(...)) end end ``` So now we've got parts. One part that knows how to get the data from Synth, and one part that knows how to turn the data we get from Synth into the data we want to use, and one that sticks them together. Plus, each part is testable independent of each other part, and, we've got a couple different levels we can PS - `...` is valid code; it's the ["pass-through"](https://sikac.hu/writing-a-pass-through-method-ruby-2-7-edition-a336239f2a61) language feature :) Final note - I really don't mean to be "here's exactly how to do it", I mean it much more like "here's some inspiration for some very simple ways to approach this". The `yield` might not work out at all, and `to_exchange_rate` stuff might get wordy and overly boiler-plate. But, I think all this avoids locking future-you into any decisions but the critical ones (how it gets used). On the implementation side of things, all of this should be really flexible and easy to change.
rangerscience (Migrated from github.com) reviewed 2024-03-25 16:03:01 +08:00
@@ -4,0 +15,4 @@
config.filter_sensitive_data("<EXCHANGE_RATES_API_KEY>") { ENV["EXCHANGE_RATES_API_KEY"] }
config.filter_sensitive_data("<MERCHANT_DATA_API_KEY>") { ENV["MERCHANT_DATA_API_KEY"] }
config.filter_sensitive_data("<REAL_ESTATE_VALUATIONS_API_KEY>") { ENV["REAL_ESTATE_VALUATIONS_API_KEY"] }
end
rangerscience (Migrated from github.com) commented 2024-03-25 16:03:00 +08:00

VCR is amazing :) I've got a couple of utility functions you might find handy, let me get back to you tomorrow...

VCR is amazing :) I've got a couple of utility functions you might find handy, let me get back to you tomorrow...
rangerscience (Migrated from github.com) reviewed 2024-03-25 16:13:34 +08:00
@@ -0,0 +7,4 @@
key: <%= ENV.fetch "MERCHANT_DATA_API_KEY", ENV["SYNTH_API_KEY"] %>
reale_state_valuations:
provider: <%= ENV.fetch "REAL_ESTATE_VALUATIONS_PROVIDER", "null" %>
key: <%= ENV["REAL_ESTATE_VALUATIONS_API_KEY"] %>
rangerscience (Migrated from github.com) commented 2024-03-25 16:13:34 +08:00

I think I'd recommend something like this, instead:

providers:
  exhange_rates: <%= ENV.fetch "EXCHANGE_RATES_PROVIDER", "synth" %>
synth:
  key: ENV["SYNTH_API_KEY"]

Mostly, this lets each provider define it's own instantiation, but still lets your user define which provider they want to use.

For example, I think it's Google Cloud whose library wants a JSON file of credentials, instead of an API key.

Like - hmm. Your abstraction isn't "all providers are the same", it's, "all providers can produce the same info". So you don't need the configuration of each provider to be identical.

I think I'd recommend something like this, instead: ``` providers: exhange_rates: <%= ENV.fetch "EXCHANGE_RATES_PROVIDER", "synth" %> synth: key: ENV["SYNTH_API_KEY"] ``` Mostly, this lets each provider define it's own instantiation, but still lets your user define which provider they want to use. For example, I think it's Google Cloud whose library wants a JSON file of credentials, instead of an API key. Like - hmm. Your abstraction _isn't_ "all providers are the same", it's, "all providers can produce the same info". So you don't need the configuration of each provider to be identical.
rangerscience (Migrated from github.com) reviewed 2024-03-25 16:18:42 +08:00
@@ -0,0 +1,19 @@
module Retryable
rangerscience (Migrated from github.com) commented 2024-03-25 16:18:42 +08:00

Faraday has a pretty sophisticated retry package: https://github.com/lostisland/faraday-retry

Faraday has a pretty sophisticated retry package: https://github.com/lostisland/faraday-retry
rangerscience (Migrated from github.com) reviewed 2024-03-25 16:22:32 +08:00
@@ -0,0 +1,83 @@
class Provider::Synth
rangerscience (Migrated from github.com) commented 2024-03-25 16:22:32 +08:00

hmm I think you can have som fun like:

def request url
  response = Faraday.get("#{base_url}/rates/historical") do |req|
      req.headers["Authorization"] = "Bearer #{api_key}"
      req.params = yield
   end 
   
   # ...
end

def fetch_exchange_rate(from:, to:, date:)
  request("#{base_url}/rates/historical") do
    { date: date.to_s, from: to: }
  end
end
hmm I think you can have som fun like: ``` def request url response = Faraday.get("#{base_url}/rates/historical") do |req| req.headers["Authorization"] = "Bearer #{api_key}" req.params = yield end # ... end def fetch_exchange_rate(from:, to:, date:) request("#{base_url}/rates/historical") do { date: date.to_s, from: to: } end end ```
josefarias (Migrated from github.com) reviewed 2024-03-25 19:47:34 +08:00
josefarias (Migrated from github.com) commented 2024-03-25 19:47:34 +08:00

That eliminates the two outermost layers of abstraction only, which are the thinnest wrappers.

Those were implemented on purpose. I think it's better for the caller to only know the name of a method rather than the name of a module, a method, and a slightly magical symbol argument.

Personally not worried about those two thin wrappers. They're meant to provide a purposeful public API.

EDIT: to be clear, I mean Providers.for(:exchange_rates) would have to implement some logic equivalent to the innermost layers. Even if not identical to the current proposal, they'd be equivalent.

That eliminates the two outermost layers of abstraction only, which are the thinnest wrappers. Those were implemented on purpose. I think it's better for the caller to only know the name of a method rather than the name of a module, a method, and a slightly magical symbol argument. Personally not worried about those two thin wrappers. They're meant to provide a purposeful public API. EDIT: to be clear, I mean `Providers.for(:exchange_rates)` would have to implement some logic equivalent to the innermost layers. Even if not identical to the current proposal, they'd be equivalent.
josefarias (Migrated from github.com) reviewed 2024-03-25 19:54:01 +08:00
@@ -0,0 +7,4 @@
key: <%= ENV.fetch "MERCHANT_DATA_API_KEY", ENV["SYNTH_API_KEY"] %>
reale_state_valuations:
provider: <%= ENV.fetch "REAL_ESTATE_VALUATIONS_PROVIDER", "null" %>
key: <%= ENV["REAL_ESTATE_VALUATIONS_API_KEY"] %>
josefarias (Migrated from github.com) commented 2024-03-25 19:54:01 +08:00

We don't need the configuration of each provider to be identical. But it's rather nice that it is.

I can see some value in adding the provider names as keys here, instead of generic "CONCEPT_PROVIDER_API_KEY". Kind of annoying that the example .env file would get bigger with each provider.

I originally had even removed SYNTH_API_KEY, but left it in as a fallback because CI would break otherwise. I think if we want to remove the synth one we'd have to add the others as secrets to GH cc @zachgoll up to you!

We don't need the configuration of each provider to be identical. But it's rather nice that it is. I can see some value in adding the provider names as keys here, instead of generic "CONCEPT_PROVIDER_API_KEY". Kind of annoying that the example .env file would get bigger with each provider. I originally had even removed SYNTH_API_KEY, but left it in as a fallback because CI would break otherwise. I think if we want to remove the synth one we'd have to add the others as secrets to GH cc @zachgoll up to you!
josefarias (Migrated from github.com) reviewed 2024-03-25 19:56:32 +08:00
@@ -0,0 +1,19 @@
module Retryable
josefarias (Migrated from github.com) commented 2024-03-25 19:56:32 +08:00

Ah didn't know about this. I could go either way. This is a pretty simple implementation.

Unless that gem can effectively handle rate limiting out of the box the way we need it. In that case I'd probably go with the gem.

Ah didn't know about this. I could go either way. This is a pretty simple implementation. Unless that gem can effectively handle rate limiting out of the box the way we need it. In that case I'd probably go with the gem.
josefarias (Migrated from github.com) reviewed 2024-03-25 19:58:21 +08:00
@@ -0,0 +1,83 @@
class Provider::Synth
josefarias (Migrated from github.com) commented 2024-03-25 19:58:21 +08:00

Yeah I've done this kinda thing in HTTP adapters before. It's dry-er. But I always ended up regretting it. It's easier to come back to all the request details being colocated.

Yeah I've done this kinda thing in HTTP adapters before. It's dry-er. But I always ended up regretting it. It's easier to come back to all the request details being colocated.
josefarias (Migrated from github.com) reviewed 2024-03-25 19:59:06 +08:00
@@ -0,0 +1,83 @@
class Provider::Synth
josefarias (Migrated from github.com) commented 2024-03-25 19:59:06 +08:00

Until they get out of hand that is. This kind of cleanup is always a small effort away if needed.

Until they get out of hand that is. This kind of cleanup is always a small effort away if needed.
josefarias commented 2024-03-25 20:09:15 +08:00 (Migrated from github.com)

Thanks for your comments @rangerscience! Responded to some of them.

This proposal prioritizes clear extension points, colocation of behavior, and minimal configuration.

I can appreciate the desire for less layers. But, as presented, I think that would be to the detriment of the colocation and clear delineation of responsibilities.

Less code is nice. But a clear mental model is nicer.

Debatable whether this mental model is actually clear. It is to me. But happy to have the maintainers disagree. All good!

Finally, I don't think what you're proposing paints a full picture. It doesn't consider implications of instantiating providers for more than one concept. And the obviation of the Provided::<Concept> classes will lead to each provider implementing the same logic to instantiate models outside their domain. There might be something to not needing to pass the API key to the provider class, though.

Thanks for your comments @rangerscience! Responded to some of them. This proposal prioritizes clear extension points, colocation of behavior, and minimal configuration. I can appreciate the desire for less layers. But, as presented, I think that would be to the detriment of the colocation and clear delineation of responsibilities. Less code is nice. But a clear mental model is nicer. Debatable whether this mental model is actually clear. It is to me. But happy to have the maintainers disagree. All good! Finally, I don't think what you're proposing paints a full picture. It doesn't consider implications of instantiating providers for more than one concept. And the obviation of the `Provided::<Concept>` classes will lead to each provider implementing the same logic to instantiate models outside their domain. There might be something to not needing to pass the API key to the provider class, though.
josefarias (Migrated from github.com) reviewed 2024-03-25 20:15:46 +08:00
@@ -0,0 +3,4 @@
class Provider::NullTest < ActiveSupport::TestCase
include ExchangeRateProviderInterfaceTest
include MerchantDataProviderInterfaceTest
include RealEstateValuationsProviderInterfaceTest
josefarias (Migrated from github.com) commented 2024-03-25 20:15:46 +08:00

I haven't! Considered those for this. But I wouldn't recommend we adopt them for the whole codebase. Seems worse to use them for a subset of the app than to implement these tests which are a simple include away.

I haven't! Considered those for this. But I wouldn't recommend we adopt them for the whole codebase. Seems worse to use them for a subset of the app than to implement these tests which are a simple `include` away.
josefarias commented 2024-03-25 22:47:10 +08:00 (Migrated from github.com)

@zachgoll one thing we could do to avoid premature abstraction (which depending on internal conversations I think this PR might do) is remove everything swapping-related and just always instantiate Provider::Synth within Providable. I'd also remove the interface tests in that case. Would be left with a pretty lean PR and we can keep this as reference for when we do want to add other providers.

@zachgoll one thing we could do to avoid premature abstraction (which depending on internal conversations I think this PR might do) is remove everything swapping-related and just always instantiate `Provider::Synth` within `Providable`. I'd also remove the interface tests in that case. Would be left with a pretty lean PR and we can keep this as reference for when we do want to add other providers.
zachgoll (Migrated from github.com) reviewed 2024-03-26 00:37:31 +08:00
zachgoll (Migrated from github.com) left a comment

@josefarias thanks for taking a pass at this! I've left a few comments and overall thoughts below.

Overall thoughts

I'm good with the overall direction here. I like how this optimizes for clear public APIs and easy provider configuration for self-hosting. Obviously lots of evolutions to go through with this as we fill in the implementation, but at the very least, this forces us to think proactively about when and how we're introducing new providers and implements a clear boundary between the Maybe domain and external world.

RE: Premature abstraction

While I agree it's a bit smelly, I do think this is a rare case where we already know the importance/necessity of data providers from Maybe 1.0 which makes this one worth thinking through earlier than later. And if we end up evolving or completely changing this down the road, totally fine with that too.

That said, I do really like your idea of instantiating Provider::Synth directly within Providable and leaving this PR as the "guide" for future extension. I think we could:

  1. Leave this PR exactly as-is and close out
  2. Add a comment with a link to this PR in Providable
  3. Open a new PR that is slimmed down

Ultimately, I think what you have here is a good long term direction that we'll be able to make even clearer as we start implementing real use-cases.

Would you agree with that approach?

@josefarias thanks for taking a pass at this! I've left a few comments and overall thoughts below. ## Overall thoughts I'm good with the overall direction here. I like how this optimizes for clear public APIs and easy provider configuration for self-hosting. Obviously lots of evolutions to go through with this as we fill in the implementation, but at the very least, this forces us to think proactively about _when_ and _how_ we're introducing new providers and implements a clear boundary between the Maybe domain and external world. ## RE: Premature abstraction While I agree it's a bit smelly, I do think this is a rare case where we already know the importance/necessity of data providers from Maybe 1.0 which makes this one worth thinking through earlier than later. And if we end up evolving or completely changing this down the road, totally fine with that too. That said, I do really like your idea of instantiating `Provider::Synth` directly within `Providable` and leaving this PR as the "guide" for future extension. I think we could: 1. Leave this PR exactly as-is and close out 2. Add a comment with a link to this PR in `Providable` 3. Open a new PR that is slimmed down Ultimately, I think what you have here is a good long term direction that we'll be able to make even clearer as we start implementing real use-cases. Would you agree with that approach?
@@ -1,42 +1,22 @@
class ExchangeRate < ApplicationRecord
include Provided
zachgoll (Migrated from github.com) commented 2024-03-26 00:34:57 +08:00

Open to opinions on this, but I wonder if we should be exposing both a find_rate and find_rate_or_fetch so that clients can "opt-in" to certain fallback behavior.

I've been going back and forth on this, but starting to lean towards treating provider data as a "Write only" concern. In other words, methods like Money.exchange_to should either find something in the database or return nil and we leave the responsibility of writing exchange rates solely to the account sync process.

In other words, the app works without any provider data and we know that operations like Money.exchange_to are "free" (of both side-effects and API usage).

Any thoughts/opinions on that?

Open to opinions on this, but I wonder if we should be exposing both a `find_rate` and `find_rate_or_fetch` so that clients can "opt-in" to certain fallback behavior. I've been going back and forth on this, but starting to lean towards treating provider data as a "Write only" concern. In other words, methods like `Money.exchange_to` should either find something in the database or return `nil` and we leave the responsibility of writing exchange rates solely to the [account sync process](https://github.com/maybe-finance/maybe/blob/110855d077a0a501674d5866fe43844df93c3159/app/models/account/syncable.rb#L64-L66). In other words, the app works without any provider data and we know that operations like `Money.exchange_to` are "free" (of both side-effects and API usage). Any thoughts/opinions on that?
zachgoll (Migrated from github.com) commented 2024-03-25 23:24:16 +08:00

Yeah I think we're fine to revisit rate-limiting later down the road.

Yeah I think we're fine to revisit rate-limiting later down the road.
zachgoll (Migrated from github.com) commented 2024-03-25 23:38:02 +08:00

I'd probably lean towards optimizing for a clearer public API with a few extra layers of abstraction at the Providable layer as @josefarias has implemented here. Providable shouldn't change often and even with a large number of providers I think the path to extension is pretty clear and trivial.

I'd probably lean towards optimizing for a clearer public API with a few extra layers of abstraction at the `Providable` layer as @josefarias has implemented here. `Providable` shouldn't change often and even with a large number of providers I think the path to extension is pretty clear and trivial.
zachgoll (Migrated from github.com) commented 2024-03-25 23:55:11 +08:00

I see for each of the Provided:: classes, we're implementing a single fetch method. As we think about extending these to fetch several types of data, are you thinking these classes will adopt additional public methods? Or are you thinking for each new type of data, we introduce a brand new Provided:: class with a single fetch?

For example, I'm guessing if we wanted to implement a method to fetch a historical series of exchange rates from the provider, we'd just throw a fetch_historical(from:, to:, date_range:) directly in this class?

I see for each of the `Provided::` classes, we're implementing a single `fetch` method. As we think about extending these to fetch several types of data, are you thinking these classes will adopt additional public methods? Or are you thinking for each new type of data, we introduce a brand new `Provided::` class with a single `fetch`? For example, I'm guessing if we wanted to implement a method to fetch a historical series of exchange rates from the provider, we'd just throw a `fetch_historical(from:, to:, date_range:)` directly in this class?
@@ -0,0 +7,4 @@
key: <%= ENV.fetch "MERCHANT_DATA_API_KEY", ENV["SYNTH_API_KEY"] %>
reale_state_valuations:
provider: <%= ENV.fetch "REAL_ESTATE_VALUATIONS_PROVIDER", "null" %>
key: <%= ENV["REAL_ESTATE_VALUATIONS_API_KEY"] %>
zachgoll (Migrated from github.com) commented 2024-03-26 00:01:13 +08:00

I'd probably vote for explicitness over brevity. I think a longer .env.example file is totally fine as long as we have good defaults.

I think the app should work without any of these environment variables, and then as the self-hoster adds in providers, the app functionality is enhanced.

We'll have several flows within the app UI that alert the user that different providers are missing and explains how it is affecting their experience. For example, they might see an alert that says:

You have not configured an exchange rate provider. Your historical data and "rollups" will hide foreign currency accounts and your stats will be inaccurate until you provide an API key.

I'd probably vote for explicitness over brevity. I think a longer `.env.example` file is totally fine as long as we have good defaults. I think the app should work without _any_ of these environment variables, and then as the self-hoster adds in providers, the app functionality is _enhanced_. We'll have several flows within the app UI that alert the user that different providers are missing and explains how it is affecting their experience. For example, they might see an alert that says: > You have not configured an exchange rate provider. Your historical data and "rollups" will hide foreign currency accounts and your stats will be inaccurate until you provide an API key.
@@ -0,0 +1,19 @@
module Retryable
zachgoll (Migrated from github.com) commented 2024-03-26 00:04:16 +08:00

I think this is simple enough to start. Would rather try our own implementation first and resort to gems only if 100% necessary.

I think this is simple enough to start. Would rather try our own implementation first and resort to gems only if 100% necessary.
rangerscience (Migrated from github.com) reviewed 2024-03-26 00:47:49 +08:00
@@ -0,0 +1,19 @@
module Retryable
rangerscience (Migrated from github.com) commented 2024-03-26 00:47:49 +08:00

The less you do, the less you have to maintain :)

The less you do, the less you have to maintain :)
rangerscience (Migrated from github.com) reviewed 2024-03-26 00:50:29 +08:00
@@ -0,0 +7,4 @@
key: <%= ENV.fetch "MERCHANT_DATA_API_KEY", ENV["SYNTH_API_KEY"] %>
reale_state_valuations:
provider: <%= ENV.fetch "REAL_ESTATE_VALUATIONS_PROVIDER", "null" %>
key: <%= ENV["REAL_ESTATE_VALUATIONS_API_KEY"] %>
rangerscience (Migrated from github.com) commented 2024-03-26 00:50:29 +08:00

We don't need the configuration of each provider to be identical. But it's rather nice that it is.

Right. So, don't lock yourself into requiring them to be identical - then you can have them identical anyway, but the ones that don't fit aren't forced to conform.

> We don't need the configuration of each provider to be identical. But it's rather nice that it is. Right. So, don't lock yourself into requiring them to be identical - then you _can_ have them identical anyway, but the ones that don't fit aren't forced to conform.
rangerscience (Migrated from github.com) reviewed 2024-03-26 00:51:53 +08:00
@@ -0,0 +1,83 @@
class Provider::Synth
rangerscience (Migrated from github.com) commented 2024-03-26 00:51:53 +08:00

Fair! I would at least reduce the boiler plate with:

req.params = {
   date: 
}

Just makes it a smidge easier to modify (and read, I'd say).

Fair! I would at least reduce the boiler plate with: ``` req.params = { date: } ``` Just makes it a smidge easier to modify (and read, I'd say).
josefarias (Migrated from github.com) reviewed 2024-03-26 01:01:18 +08:00
@@ -0,0 +7,4 @@
key: <%= ENV.fetch "MERCHANT_DATA_API_KEY", ENV["SYNTH_API_KEY"] %>
reale_state_valuations:
provider: <%= ENV.fetch "REAL_ESTATE_VALUATIONS_PROVIDER", "null" %>
key: <%= ENV["REAL_ESTATE_VALUATIONS_API_KEY"] %>
josefarias (Migrated from github.com) commented 2024-03-26 01:01:18 +08:00

I don't think we need to solve for this right now. Any non-standard behavior can be addressed in the concrete provider initializer. Also not sure data providers would require much more than an API key usually — I don't anticipate Google Cloud or similar being a data provider in this capacity.

Might push up a commit where the provider classes know about their provider-specific env vars. Would certainly be nice to be able to instantiate them without further config. Will revisit after work.

I don't think we need to solve for this right now. Any non-standard behavior can be addressed in the concrete provider initializer. Also not sure data providers would require much more than an API key usually — I don't anticipate Google Cloud or similar being a data provider in this capacity. Might push up a commit where the provider classes know about their provider-specific env vars. Would certainly be nice to be able to instantiate them without further config. Will revisit after work.
rangerscience commented 2024-03-26 01:04:17 +08:00 (Migrated from github.com)

Finally, I don't think what you're proposing paints a full picture.

Heh :) I mean, I nerd sniped myself at 3:00 AM, so yeah, doesn't paint a full picture :)

I think the most critical bit I'm adding is pulling apart the code that chooses which provider gets used, from the code that defines those providers.

It doesn't consider implications of instantiating providers for more than one concept.

I don't think I follow? If you have more concepts, you just have more Providers.concept initializers. If you have more than one provider for a given concept, you just wire them into those initializers; and, if you've got common code, pull it out into (as you propose) Provider::Concept base classes (or any of the other ways to handle that in Ruby). There just isn't any common to pull out yet.

I'd toss in: What happens when you have a provider that provides a unique mix of multiple concepts? Like, if Zillow does both exchange rates and housing prices, while Wall Street Bets does exchange rates and stock prices?

And the obviation of the Provided::<Concept> classes will lead to each provider implementing the same logic to instantiate models outside their domain.

Don't need it yet, and you add it when you've got common code to start pulling out. If you want to do something resembling interface definitions, I'd do it in test code rather than inheritance stuff - Ruby's a duck-typed language, so it... "fits" a bit better to write up tests that say "in order to be an X, it has to pass these tests", than to define and interface you inherit from.

There might be something to not needing to pass the API key to the provider class, though.

Yah! I mean, for most things that's what you want, but, how many times have you run across an API that tries to get weird with it (looking at you, GCP, with your credentials file)?

> Finally, I don't think what you're proposing paints a full picture. Heh :) I mean, I nerd sniped myself at 3:00 AM, so yeah, doesn't paint a full picture :) I think the most critical bit I'm adding is pulling apart the code that chooses _which_ provider gets used, from the code that defines those providers. > It doesn't consider implications of instantiating providers for more than one concept. I don't think I follow? If you have more concepts, you just have more `Providers.concept` initializers. If you have more than one provider for a given concept, you just wire them into those initializers; and, if you've got common code, pull it out into (as you propose) `Provider::Concept` base classes (or any of the other ways to handle that in Ruby). There just isn't any common to pull out yet. I'd toss in: What happens when you have a provider that provides a unique mix of multiple concepts? Like, if Zillow does both exchange rates and housing prices, while Wall Street Bets does exchange rates and stock prices? > And the obviation of the `Provided::<Concept>` classes will lead to each provider implementing the same logic to instantiate models outside their domain. Don't need it yet, and you add it when you've got common code to start pulling out. If you want to do something resembling interface definitions, I'd do it in test code rather than inheritance stuff - Ruby's a duck-typed language, so it... "fits" a bit better to write up tests that say "in order to be an X, it has to pass these tests", than to define and interface you inherit from. > There might be something to not needing to pass the API key to the provider class, though. Yah! I mean, for most things that's what you want, but, how many times have you run across an API that tries to get weird with it (looking at you, GCP, with your credentials _file_)?
josefarias commented 2024-03-26 01:20:20 +08:00 (Migrated from github.com)

I don't think I follow?

We're on the same page. I meant your proposal didn't include that code where you add more concepts, ultimately calling for an abstraction. I just went directly to that abstraction because this PR is about sketching out a multi-concept, multi-provider, scaled, solution. That's what was asked for per the learnings in Maybe 1.0. This is not meant to be merged as-is — as articulated in the PR description.

What happens when you have a provider that provides a unique mix of multiple concepts?

I think that's very reasonably addressed in this PR. Synth, Null, and Zillow provide a mix of multiple concepts. Some shared, some not.

If you want to do something resembling interface definitions, I'd do it in test code rather than inheritance stuff

Agreed. It's done in test code in this PR.


@zachgoll thanks for the comments! I'll look into those after work. That plan sounds perfect. I'll push up some final changes and post some notes for future consideration — will then close this and implement a pared down, only-what-we-need-right-now version in a separate PR as described in our comments above. Which I think is what @rangerscience is advocating for.

> I don't think I follow? We're on the same page. I meant your proposal didn't include that code where you add more concepts, ultimately calling for an abstraction. I just went directly to that abstraction because this PR is about sketching out a multi-concept, multi-provider, scaled, solution. That's what was asked for per the learnings in Maybe 1.0. This is not meant to be merged as-is — as articulated in the PR description. > What happens when you have a provider that provides a unique mix of multiple concepts? I think that's very reasonably addressed in this PR. Synth, Null, and Zillow provide a mix of multiple concepts. Some shared, some not. > If you want to do something resembling interface definitions, I'd do it in test code rather than inheritance stuff Agreed. It's done in test code in this PR. --- @zachgoll thanks for the comments! I'll look into those after work. That plan sounds perfect. I'll push up some final changes and post some notes for future consideration — will then close this and implement a pared down, only-what-we-need-right-now version in a separate PR as described in our comments above. Which I think is what @rangerscience is advocating for.
rangerscience commented 2024-03-26 03:00:50 +08:00 (Migrated from github.com)

Pleasure discussing all this with y'all, and thanks for my first real experience participating in open source :)

separate PR

I might chuck in a link to these PRs in code comments, so that it's easy for someone to find all this additional context :)

Pleasure discussing all this with y'all, and thanks for my first real experience participating in open source :) > separate PR I might chuck in a link to these PRs in code comments, so that it's easy for someone to find all this additional context :)
josefarias (Migrated from github.com) reviewed 2024-03-26 09:16:30 +08:00
josefarias (Migrated from github.com) commented 2024-03-26 09:16:30 +08:00

You know? Come to think of it, I think this abstraction might be anemic.

I think we can just call the provider and instantiate a record from within the model. Would probably live within its own concern.

This would happen inside ExchangeRate.fetch_from_provider and ExchangeRate.fetch_historical_from_provider methods.

I'll spike on it tonight and see where it gets us. Will probably move the retryable stuff to the provider instead.

You know? Come to think of it, I think this abstraction might be anemic. I think we can just call the provider and instantiate a record from within the model. Would probably live within its own concern. This would happen inside `ExchangeRate.fetch_from_provider` and `ExchangeRate.fetch_historical_from_provider` methods. I'll spike on it tonight and see where it gets us. Will probably move the retryable stuff to the provider instead.
josefarias (Migrated from github.com) reviewed 2024-03-26 09:19:16 +08:00
@@ -1,42 +1,22 @@
class ExchangeRate < ApplicationRecord
include Provided
josefarias (Migrated from github.com) commented 2024-03-26 09:19:16 +08:00

Yes. I think that makes perfect sense. Definitely smelly that we'd write to the database when fetching something. And if done in an HTTP GET request, that's even worse! Great callout.

Yes. I think that makes perfect sense. Definitely smelly that we'd write to the database when fetching something. And if done in an HTTP GET request, that's even worse! Great callout.
josefarias (Migrated from github.com) reviewed 2024-03-26 10:59:51 +08:00
@@ -0,0 +7,4 @@
key: <%= ENV.fetch "MERCHANT_DATA_API_KEY", ENV["SYNTH_API_KEY"] %>
reale_state_valuations:
provider: <%= ENV.fetch "REAL_ESTATE_VALUATIONS_PROVIDER", "null" %>
key: <%= ENV["REAL_ESTATE_VALUATIONS_API_KEY"] %>
josefarias (Migrated from github.com) commented 2024-03-26 10:59:51 +08:00
https://github.com/maybe-finance/maybe/pull/561/commits/ef7ced2e4c4fb752425e219a717ac90ae4424c1b
josefarias (Migrated from github.com) reviewed 2024-03-26 11:02:01 +08:00
@@ -0,0 +1,83 @@
class Provider::Synth
josefarias (Migrated from github.com) commented 2024-03-26 11:02:01 +08:00

Moved some things around in c49782ebfa and these classes ended up arguably messier.

I think that's fine. It's all self contained and easily refactored as patterns emerge within each provider.

Gut tells me to avoid sharing too much code between providers. Yes, they're all providers in the realm of Maybe. But they're each their own thing and have their own lifecycles. Probably best not to tie them together?

Moved some things around in https://github.com/maybe-finance/maybe/pull/561/commits/c49782ebfab9289bfd871c4616bd64e48fa063b8 and these classes ended up arguably messier. I think that's fine. It's all self contained and easily refactored as patterns emerge within each provider. Gut tells me to avoid sharing too much code between providers. Yes, they're all providers in the realm of Maybe. But they're each their own thing and have their own lifecycles. Probably best not to tie them together?
josefarias (Migrated from github.com) reviewed 2024-03-26 11:02:40 +08:00
josefarias (Migrated from github.com) commented 2024-03-26 11:02:39 +08:00
https://github.com/maybe-finance/maybe/pull/561/commits/c49782ebfab9289bfd871c4616bd64e48fa063b8
josefarias commented 2024-03-26 12:31:08 +08:00 (Migrated from github.com)

I've moved some things around. Left a summary in the commit message but re-sharing here for convenience:

Nix anemic abstractions

* The `Provided::<Concept>` classes were anemic. It's fine for the model to know
how to call a provider that provides it directly. It's also better to have the model instantiate
domain objects. And, finally, it makes more sense for the class making
the HTTP requests to be aware of retries.

* The base implementations for provider methods were moved from `Provider::Base` to
`Provider::Null`. They were part of the base so that every provider responds to every
providable method, which is unnecessary. Let's be less defensive about it.
Things will likely break with NoMethodError anyway in the edge case
this was trying to guard against.

* Previously, when HTTP providers like Synth failed, they would raise.
Now they return an inspectable response object of the same shape as when requests are successful,
but it returns false for `#success?`. This is handy for debugging when you'll likely
want to inspect the raw response. We can still raise on the Maybe side if needed.
But that's a Maybe concern not a provider concern.

* `ExchangeRate#find_rate` doesn't raise anymore in preparation for opting into fetching —
which should only happen on writes.

Will update the PR description tomorrow to match.

Ran out of time today to cherry pick just the parts we need. I'll do that tomorrow. Of course, the Maybe team should feel free to do it themselves if they're in a hurry.

Gotta say I really like where we're at with this. And I like the approach we took: take the learnings from a previously live product, validate the approach by scaling it artificially, implement a pared down version of the validated approach. Let's hope it pans out in the long run as well!

I've moved some things around. Left a summary in the commit message but re-sharing here for convenience: ``` Nix anemic abstractions * The `Provided::<Concept>` classes were anemic. It's fine for the model to know how to call a provider that provides it directly. It's also better to have the model instantiate domain objects. And, finally, it makes more sense for the class making the HTTP requests to be aware of retries. * The base implementations for provider methods were moved from `Provider::Base` to `Provider::Null`. They were part of the base so that every provider responds to every providable method, which is unnecessary. Let's be less defensive about it. Things will likely break with NoMethodError anyway in the edge case this was trying to guard against. * Previously, when HTTP providers like Synth failed, they would raise. Now they return an inspectable response object of the same shape as when requests are successful, but it returns false for `#success?`. This is handy for debugging when you'll likely want to inspect the raw response. We can still raise on the Maybe side if needed. But that's a Maybe concern not a provider concern. * `ExchangeRate#find_rate` doesn't raise anymore in preparation for opting into fetching — which should only happen on writes. ``` Will update the PR description tomorrow to match. Ran out of time today to cherry pick just the parts we need. I'll do that tomorrow. Of course, the Maybe team should feel free to do it themselves if they're in a hurry. Gotta say I really like where we're at with this. And I like the approach we took: take the learnings from a previously live product, validate the approach by scaling it artificially, implement a pared down version of the validated approach. Let's hope it pans out in the long run as well!
josefarias commented 2024-03-26 12:45:08 +08:00 (Migrated from github.com)

Closing before cherry-picking the immediately relevant bits in a separate PR.

This can be used as reference when we introduce new concepts and providers. Judgement should be used considering this PR was based on assumptions about how non-exchange rate concepts would work.

Closing before cherry-picking the immediately relevant bits in a separate PR. This can be used as reference when we introduce new concepts and providers. Judgement should be used considering this PR was based on assumptions about how non-exchange rate concepts would work.
zachgoll commented 2024-03-26 20:08:29 +08:00 (Migrated from github.com)

@josefarias awesome, I agree that this is at a good spot. The mental model is pretty clear and intuitive to me so hopefully that holds true for future contributors to the project as well and gives us a good launch point. Will go check out the new PR when it's up!

@josefarias awesome, I agree that this is at a good spot. The mental model is pretty clear and intuitive to me so hopefully that holds true for future contributors to the project as well and gives us a good launch point. Will go check out the new PR when it's up!
josefarias commented 2024-03-27 11:43:57 +08:00 (Migrated from github.com)
New PR is up https://github.com/maybe-finance/maybe/pull/574 @zachgoll

Pull request closed

Sign in to join this conversation.