Process some documents
Now that you have your access token, you can send documents to Semantria. Because Semantria processes documents asynchronously, you also need to request the results after submitting. Here is a simple script that submits documents for analysis and retrieves the results. Note that the script does not handle failure cases, such as a document exceeding the length limit. If such an error occurs, the script simply exits.
To run the script, set the environment variable SEMANTRIA_TOKEN to your access token beforehand, or replace the assignment in the script (e.g., SEMANTRIA_TOKEN = "").
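As a small safeguard, the token lookup can fail fast when the variable is missing rather than sending an unauthenticated request. This is a minimal sketch; `load_token` is a hypothetical helper written for this guide, not part of any Semantria SDK.

```python
import os


def load_token(env=None):
    """Return the access token from SEMANTRIA_TOKEN, or raise if it is unset.

    load_token is a hypothetical helper for this guide, not a Semantria API.
    """
    env = os.environ if env is None else env
    token = (env.get("SEMANTRIA_TOKEN") or "").strip()
    if not token:
        raise RuntimeError("Set SEMANTRIA_TOKEN to your access token first")
    return token
```

In the script, the assignment would then read `SEMANTRIA_TOKEN = load_token()`.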
# send batch of documents using the default English language template
# a job ID is used here so that multiple senders and receivers do not interfere with each other
curl -X POST 'https://api5.semantria.com/documents/?using=en&job_id=job-1' \
-H 'x-api-version: 5.0' \
-H 'x-app-name: curl_test' \
-H 'Authorization: <AUTH_TOKEN>' \
-H 'Content-type: application/json' \
--data-binary '[{"id": "Sutton_Hoo_Helmet", "text": "Sutton Hoo Helmet is a 2002 sculpture by ...SNIP... mining"}]'
# poll for docs
curl -X GET 'https://api5.semantria.com/documents/?job_id=job-1&limit=50' \
-H 'x-api-version: 5.0' \
-H 'x-app-name: curl_test' \
-H 'Authorization: <AUTH_TOKEN>' \
-H 'Content-type: application/json'
import json, requests, sys, os, time, uuid

SEMANTRIA_TOKEN = os.getenv('SEMANTRIA_TOKEN')

# Some sample text
initialTexts = [
    "Lisa - there's 2 Skinny cow coupons available $5 skinny cow ice cream coupons on special k boxes and Printable FPC from facebook - a teeny tiny cup of ice cream. I printed off 2 (1 from my account and 1 from dh's). I couldn't find them instore and i'm not going to walmart before the 19th. Oh well sounds like i'm not missing much ...lol",
    "In Lake Louise - a guided walk for the family with Great Divide Nature Tours rent a canoe on Lake Louise or Moraine Lake go for a hike to the Lake Agnes Tea House. In between Lake Louise and Banff - visit Marble Canyon or Johnson Canyon or both for family friendly short walks. In Banff a picnic at Johnson Lake rent a boat at Lake Minnewanka hike up Tunnel Mountain walk to the Bow Falls and the Fairmont Banff Springs Hotel visit the Banff Park Museum. The \"must-do\" in Banff is a visit to the Banff Gondola and some time spent on Banff Avenue - think candy shops and ice cream.",
    "On this day in 1786 - In New York City commercial ice cream was manufactured for the first time.",
]

headers = {'content-type': 'application/json',
           'x-api-version': '5.0',
           'Authorization': SEMANTRIA_TOKEN}
params = {'using': 'en'}

# Send 'text_list' to Semantria for processing.
def send_batch(text_list):
    batch = []
    for text in text_list:
        doc_id = str(uuid.uuid4())
        batch.append({'id': doc_id, 'text': text})
    print('Sending batch')
    response = requests.post('https://api5.semantria.com/documents/',
                             json=batch, headers=headers, params=params)
    if response.status_code not in [200, 202]:
        print('ERROR: problem submitting batch. status: {}, message: {}'.format(
            response.status_code, response.text))
        sys.exit(response.status_code)
    print("{0} documents queued successfully".format(len(batch)))

# Poll Semantria for analysis results until some are received.
def poll_for_results():
    while True:
        time.sleep(1)
        print('Polling')
        response = requests.get('https://api5.semantria.com/documents/', headers=headers)
        if response.status_code not in [200]:
            print('ERROR: problem polling. status: {}, message: {}'.format(
                response.status_code, response.text))
            sys.exit(response.status_code)
        results = response.json()
        if results:
            print('Got results:\n')
            json.dump(results, sys.stdout, indent=4, sort_keys=True)
            print()
            break

send_batch(initialTexts)
poll_for_results()
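The script exits on any error. For the over-long-document case mentioned earlier, one option is to set oversized texts aside before submitting. This is only a sketch: `MAX_CHARS` is a placeholder value, not Semantria's actual limit (check the limit that applies to your account), and `split_by_length` is a hypothetical helper.

```python
MAX_CHARS = 2048  # placeholder value, NOT the real Semantria limit


def split_by_length(texts, max_chars=MAX_CHARS):
    """Partition texts into (sendable, too_long) by character count."""
    sendable, too_long = [], []
    for text in texts:
        # Route each text to the matching list based on its length.
        (sendable if len(text) <= max_chars else too_long).append(text)
    return sendable, too_long
```

With this in place, the script could call `ok, skipped = split_by_length(initialTexts)`, pass `ok` to `send_batch`, and log or truncate the entries in `skipped`.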
In a real application, you will want one job for queueing documents and a separate job for requesting results.
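One way to sketch that split is to reuse the `job_id` and `limit` query parameters from the curl example, so that a receiver only collects documents queued under the same job. The function names, the `fetch` callable, and the `max_polls` cap are assumptions made for illustration; in practice `fetch` would wrap `requests.get` on the documents endpoint.

```python
import time
import uuid


def build_batch(texts):
    """Sender side: wrap each text in a document with a unique ID."""
    return [{"id": str(uuid.uuid4()), "text": text} for text in texts]


def sender_params(job_id, using="en"):
    """Query parameters for the POST, mirroring the curl example."""
    return {"using": using, "job_id": job_id}


def receiver_params(job_id, limit=50):
    """Query parameters for the GET, mirroring the curl example."""
    return {"job_id": job_id, "limit": limit}


def collect_results(fetch, job_id, expected, max_polls=30, delay=1.0):
    """Receiver side: poll until `expected` documents arrive or polls run out.

    `fetch` is any callable that takes the query parameters and returns a
    list of processed documents (possibly empty).
    """
    results = []
    for _ in range(max_polls):
        results.extend(fetch(receiver_params(job_id)))
        if len(results) >= expected:
            break
        time.sleep(delay)
    return results
```

Unlike the `while True` loop in the script, `collect_results` stops after `max_polls` attempts, so a receiver cannot hang forever if a document never comes back. The sender would pass `sender_params(job_id)` as `params` to `requests.post`, and the receiver would run `collect_results` in its own process with the same job ID.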