feat: anthropic livebook
TwistingTwists committed Mar 25, 2024
1 parent 4ef6dd8 commit 2ce3f3e
Showing 1 changed file with 134 additions and 0 deletions.
134 changes: 134 additions & 0 deletions pages/llm-providers/anthropic.livemd
@@ -0,0 +1,134 @@
<!-- livebook:{"persist_outputs":true} -->

# Text Classification - Anthropic

```elixir
Mix.install(
  [
    {:instructor,
     path:
       Path.expand("/Users/abhishek/Downloads/perps/experiments/open_source_elixir/instructor_ex")}
  ],
  config: [
    instructor: [
      adapter: Instructor.Adapters.Anthropic,
      anthropic: [api_key: System.fetch_env!("LB_ANTHROPIC_API_KEY")]
    ]
  ]
)
```

## Motivation

Text classification is a common NLP task with broad application across software, from spam detection to support ticket categorization. Historically, it required training bespoke models on thousands of pre-labeled examples. With LLMs, much of this knowledge is already encoded in the model. With proper instructions, and by constraining the output to a known set of classes, you can have a working text classifier in no time.

Hell, you can even use Instructor to help generate the training set for your own, more efficient model. But let's not get ahead of ourselves; there's more on that later in the tutorials.

## Binary Text Classification

Spam detection is a classic example of binary text classification. It comes down to deciding whether or not an example belongs to the class. This is pretty trivial to implement in Instructor.

````elixir
defmodule SpamPrediction do
  use Ecto.Schema
  use Instructor.Validator

  @doc """
  ## Field Descriptions:
  - class: Whether or not the email is spam.
  - reason: A short, less than 10 word rationalization for the classification.
  - score: A confidence score between 0.0 and 1.0 for the classification.
  """
  @primary_key false
  embedded_schema do
    field(:class, Ecto.Enum, values: [:spam, :not_spam])
    field(:reason, :string)
    field(:score, :float)
  end

  @impl true
  def validate_changeset(changeset) do
    changeset
    |> Ecto.Changeset.validate_number(:score,
      greater_than_or_equal_to: 0.0,
      less_than_or_equal_to: 1.0
    )
  end
end

is_spam? = fn text ->
  Instructor.chat_completion(
    model: "claude-3-haiku-20240307",
    response_model: SpamPrediction,
    max_retries: 3,
    mode: :md_json,
    messages: [
      %{
        role: "user",
        content: """
        Your purpose is to classify customer support emails as either spam or not.
        This is for a clothing retail business.
        They sell all types of clothing.
        Classify the following email:
        ```
        #{text}
        ```
        """
      }
    ]
  )
end

is_spam?.("Hello I am a Nigerian prince and I would like to send you money")
````
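Once a prediction comes back, the caller still has to decide what to do with it. The following is a minimal sketch of one way to route on the result, assuming `is_spam?` returns `{:ok, prediction}` on success and `{:error, reason}` otherwise. The `SpamRouting` module, the action atoms, and the `0.8` threshold are all hypothetical names chosen for illustration; they are not part of Instructor.

```elixir
defmodule SpamRouting do
  # Hypothetical helper, not part of Instructor.
  # Assumed cutoff: only discard mail the model is fairly sure about.
  @threshold 0.8

  # Confident spam prediction: drop the email.
  def route({:ok, %{class: :spam, score: score}}) when score >= @threshold, do: :discard

  # Spam prediction with low confidence: let a human look at it.
  def route({:ok, %{class: :spam}}), do: :needs_review

  # Not spam: pass it through.
  def route({:ok, %{class: :not_spam}}), do: :deliver

  # The call failed or validation never passed: fall back to manual review.
  def route({:error, _reason}), do: :needs_review
end

# Plain maps stand in for %SpamPrediction{} here; the patterns match structs too.
SpamRouting.route({:ok, %{class: :spam, reason: "Advance-fee scam", score: 0.95}})
```

Because the clauses match only on the `class` and `score` keys, the same function works on a `%SpamPrediction{}` struct, so in the notebook it could be called as `text |> is_spam?.() |> SpamRouting.route()`.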

We don't have to stop at a binary decision. The same idea extends easily to multiple categories or classes. In this example, let's consider classifying support emails. We want to know whether an email is a `general_inquiry`, a `billing_issue`, or a `technical_issue`, or whether it rightly fits in multiple classes. This can be useful if we want to cc specialized support agents when customer issues intersect.

We can leverage `Ecto.Enum` to define a schema that restricts the LLM output to be a list of those values. We can also provide a `@doc` description to help guide the LLM with the semantic understanding of what these classifications ought to represent.

```elixir
defmodule EmailClassifications do
  use Ecto.Schema

  @doc """
  A classification of a customer support email.
  technical_issue - whether the user is having trouble accessing their account.
  billing_issue - whether the customer is having trouble managing their billing or credit card
  general_inquiry - all other issues
  """
  @primary_key false
  embedded_schema do
    field(:tags, {:array, Ecto.Enum},
      values: [:general_inquiry, :billing_issue, :technical_issue]
    )
  end
end

classify_email = fn text ->
  {:ok, %{tags: result}} =
    Instructor.chat_completion(
      # The notebook is configured with the Anthropic adapter, so this uses
      # the same Claude model and mode as the spam example instead of an
      # OpenAI model name.
      model: "claude-3-haiku-20240307",
      response_model: EmailClassifications,
      mode: :md_json,
      messages: [
        %{
          role: "user",
          content: "Classify the following text: #{text}"
        }
      ]
    )

  result
end

classify_email.("My account is locked and I can't access my billing info.")
```

<!-- livebook:{"output":true} -->

```
[:technical_issue, :billing_issue]
```
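The returned tags can then drive the cc'ing of specialized agents mentioned earlier. Here is a minimal sketch, assuming one mailbox per tag. The `SupportRouting` module and the `example.com` addresses are hypothetical placeholders, not anything defined by Instructor or by this notebook.

```elixir
defmodule SupportRouting do
  # Hypothetical mapping from classification tag to a support mailbox.
  @agents %{
    general_inquiry: "support@example.com",
    billing_issue: "billing@example.com",
    technical_issue: "tech@example.com"
  }

  # Turns a list of tags into the list of addresses to cc,
  # dropping duplicate tags and keeping the order the model returned.
  def cc_list(tags) do
    tags
    |> Enum.uniq()
    |> Enum.map(&Map.fetch!(@agents, &1))
  end
end

SupportRouting.cc_list([:technical_issue, :billing_issue])
# => ["tech@example.com", "billing@example.com"]
```

`Map.fetch!/2` raises on an unknown tag, which is a deliberate choice in this sketch: since the schema restricts tags to the three enum values, an unexpected tag would point to a mismatch between the schema and the mapping.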

<!-- livebook:{"offset":4335,"stamp":{"token":"XCP.z64pnpwAt12QVZZsOwZ6F4ZHjZz_AQ4EOrHm2Oty-LTs40gELuOk8NpLoz4A0d8TU_d_JsWFYjmtedcbLWMIT5fQxG9kUhv7g339Y6UB8ejsS0VXBeBShcuthFzN","version":2}} -->
