NLP Examples
Tags: #naas #nlp #snippet
Author: Jeremy Ravenel

How does it work?

Naas NLP formulas follow this format.
nlp.get(task, model, tokenizer)(inputs)
The supported tasks are the following:
  • text-generation (model: gpt2)
  • summarization (model: t5-small)
  • fill-mask (model: distilroberta-base)
  • text-classification (model: distilbert-base-uncased-finetuned-sst-2-english)
  • feature-extraction (model: distilbert-base-cased)
  • token-classification (model: dslim/bert-base-NER)
  • question-answering
  • translation
We use the Hugging Face API under the hood to access the models.
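Because the models come from Hugging Face, `nlp.get` can be pictured as a thin wrapper around the `transformers.pipeline` factory. The sketch below is an assumption about that design, not the actual naas_drivers source: `DEFAULT_MODELS`, `pipeline_kwargs`, and `get` are hypothetical names, and the default model for each task is simply copied from the list above.

```python
# Hypothetical sketch of a thin wrapper around transformers.pipeline.
# This is NOT the naas_drivers source; all names here are illustrative.

# Default models copied from the task list above. Tasks listed without a
# model (question-answering, translation) are left out on purpose.
DEFAULT_MODELS = {
    "text-generation": "gpt2",
    "summarization": "t5-small",
    "fill-mask": "distilroberta-base",
    "text-classification": "distilbert-base-uncased-finetuned-sst-2-english",
    "feature-extraction": "distilbert-base-cased",
    "token-classification": "dslim/bert-base-NER",
}


def pipeline_kwargs(task, model=None, tokenizer=None):
    """Build the keyword arguments to pass to transformers.pipeline."""
    model = model or DEFAULT_MODELS.get(task)
    # Assumption: when no tokenizer is given, reuse the model name.
    tokenizer = tokenizer or model
    kwargs = {"task": task}
    if model is not None:
        kwargs["model"] = model
    if tokenizer is not None:
        kwargs["tokenizer"] = tokenizer
    return kwargs


def get(task, model=None, tokenizer=None):
    """Return a callable pipeline; call the result with the input text."""
    from transformers import pipeline  # imported lazily: heavy dependency

    return pipeline(**pipeline_kwargs(task, model, tokenizer))
```

With this shape, `get("summarization")(text)` mirrors the two-step call format shown earlier: the first call builds the pipeline, the second runs it on the input.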

Input

Import library

from naas_drivers import nlp

Model

Text Generation

nlp.get("text-generation", model="gpt2", tokenizer="gpt2")("What is the most important thing in your life right now?")

Text Summarization

Summarizes the given text. The maximum length of the summary (number of tokens) is set to 200.
nlp.get("summarization", model="t5-small", tokenizer="t5-small")('''

There will be fewer and fewer jobs that a robot cannot do better.
What to do about mass unemployment this is gonna be a massive social challenge and
I think ultimately we will have to have some kind of universal basic income.

I think some kind of a universal basic income is going to be necessary
now the output of goods and services will be extremely high
so with automation they will they will come abundance there will be or almost everything will get very cheap.

The harder challenge much harder challenge is how do people then have meaning like a lot of people
they find meaning from their employment so if you don't have if you're not needed if
there's not a need for your labor how do you what's the meaning if you have meaning
if you feel useless these are much that's a much harder problem to deal with.

''')

Text Classification

Basic sentiment analysis on a text. Returns a "label" (NEGATIVE or POSITIVE for this SST-2 model) and a confidence score between 0 and 1.
nlp.get("text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",
    tokenizer="distilbert-base-uncased-finetuned-sst-2-english")('''

It was a weird concept. Why would I really need to generate a random paragraph?
Could I actually learn something from doing so?
All these questions were running through her head as she pressed the generate button.
To her surprise, she found what she least expected to see.

''')
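The Hugging Face pipeline for this SST-2 model reports a `label` of POSITIVE or NEGATIVE together with a confidence `score` between 0 and 1. When a single signed value from -1 to 1 is more convenient, the two fields can be folded together. The helper below is a minimal sketch: `signed_score` is a hypothetical name, and the sample dictionary holds made-up values, not real model output.

```python
def signed_score(prediction):
    """Map a {'label', 'score'} dict to a value in [-1, 1].

    POSITIVE keeps its score and NEGATIVE is negated. Assumes the SST-2
    label names 'POSITIVE' and 'NEGATIVE'.
    """
    label = prediction["label"].upper()
    if label == "POSITIVE":
        return prediction["score"]
    if label == "NEGATIVE":
        return -prediction["score"]
    raise ValueError(f"Unexpected label: {prediction['label']!r}")


# Illustrative values only, not real model output.
sample = [{"label": "NEGATIVE", "score": 0.9}]
print(signed_score(sample[0]))
```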

Fill Mask

Fills the blank ('<mask>') in a given sentence with several proposals. Each proposal has a score (the model's confidence), a token (the numeric ID of the proposed word), and a token_str (the proposed word).
nlp.get("fill-mask",
    model="distilroberta-base",
    tokenizer="distilroberta-base")('''

It was a beautiful <mask>.

''')
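Since each proposal carries a score and a token_str, a common next step is to rank the proposals and keep the best few. The following is a minimal sketch assuming the output is a list of dictionaries with `score`, `token`, and `token_str` keys; `top_proposals` is a hypothetical helper and the sample values are made up for illustration.

```python
def top_proposals(proposals, k=3):
    """Return the k highest-scoring (word, score) pairs."""
    ranked = sorted(proposals, key=lambda p: p["score"], reverse=True)
    # Assumption: token_str may carry a leading space, so strip it.
    return [(p["token_str"].strip(), p["score"]) for p in ranked[:k]]


# Illustrative values only, not real model output.
sample = [
    {"score": 0.10, "token": 101, "token_str": " morning"},
    {"score": 0.25, "token": 202, "token_str": " day"},
    {"score": 0.05, "token": 303, "token_str": " night"},
]
print(top_proposals(sample, k=2))
```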

Feature extraction

This generates word embeddings (numerical vectors extracted from the text data). The output is a list of numerical values.
nlp.get("feature-extraction", model="distilbert-base-cased", tokenizer="distilbert-base-cased")("Life is a super cool thing")
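Token-level embeddings are often averaged into one vector per sentence (mean pooling) before being compared or fed to another model. The sketch below assumes the output is nested as [1][number of tokens][hidden size], which is the usual shape for a single input string; `mean_pool` is a hypothetical helper and the toy input is not real model output.

```python
def mean_pool(features):
    """Average token vectors into a single sentence vector.

    `features` is assumed to be shaped [1][num_tokens][hidden_size].
    """
    token_vectors = features[0]
    if not token_vectors:
        raise ValueError("No token vectors to pool")
    hidden_size = len(token_vectors[0])
    return [
        sum(vec[i] for vec in token_vectors) / len(token_vectors)
        for i in range(hidden_size)
    ]


# Toy input: 1 sentence, 2 tokens, 3 dimensions.
toy = [[[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]]
print(mean_pool(toy))  # [2.0, 3.0, 4.0]
```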

Token classification

This is named entity recognition (NER): given a text containing names, locations, or other entities, the model detects and labels them.
Entity abbreviation    Description
O                      Outside of a named entity
B-MISC                 Beginning of a miscellaneous entity right after another miscellaneous entity
I-MISC                 Miscellaneous entity
B-PER                  Beginning of a person’s name right after another person’s name
I-PER                  Person’s name
B-ORG                  Beginning of an organization right after another organization
I-ORG                  Organization
B-LOC                  Beginning of a location right after another location
I-LOC                  Location
Full documentation: https://huggingface.co/dslim/bert-base-NER

Output

Display result

nlp.get("token-classification", model="dslim/bert-base-NER", tokenizer="dslim/bert-base-NER")('''

My name is Wolfgang and I live in Berlin

''')
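The raw result is one entry per tagged token, so multi-token names arrive in pieces. The sketch below merges them back into whole entities using the B-/I- prefixes. It assumes each entry is a dictionary with `entity`, `index`, and `word` keys and that sub-word pieces start with `##`; `group_entities` is a hypothetical helper and the sample entries are made up for illustration.

```python
def group_entities(tokens):
    """Merge tagged tokens into (entity_type, text) pairs."""
    entities = []      # list of [kind, text], built up in order
    last_index = None  # position of the previous tagged token
    for tok in tokens:
        prefix, _, kind = tok["entity"].partition("-")
        if prefix == "O":
            continue  # outside of any named entity
        word = tok["word"]
        adjacent = last_index is not None and tok["index"] == last_index + 1
        if adjacent and word.startswith("##"):
            # Sub-word continuation: glue onto the previous word.
            entities[-1][1] += word[2:]
        elif adjacent and prefix == "I" and entities[-1][0] == kind:
            # Next word of the same entity.
            entities[-1][1] += " " + word
        else:
            # A B- tag, a change of type, or a gap starts a new entity.
            entities.append([kind, word])
        last_index = tok["index"]
    return [tuple(e) for e in entities]


# Illustrative entries only, not real model output.
sample = [
    {"entity": "B-PER", "index": 1, "word": "Ada"},
    {"entity": "I-PER", "index": 2, "word": "Love"},
    {"entity": "I-PER", "index": 3, "word": "##lace"},
    {"entity": "B-LOC", "index": 7, "word": "London"},
]
print(group_entities(sample))
```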