API
The primary Parser class and serialize decorator are available from the models and serializers modules.
>>> from soupstars.models import Parser
>>> from soupstars.serializers import serialize
Both objects are also exported from the top-level package.
>>> from soupstars import Parser, serialize
Models
The primary model provided by soupstars is the Parser class. It should generally be subclassed when building your own parsers.
When you initialize a parser with a URL, it automatically downloads the webpage at that URL and stores both the request and the response as attributes.
>>> from soupstars import Parser, serialize
>>> class MyParser(Parser):
...     @serialize
...     def item(self):
...         return 'An item!'
...
>>> parser = MyParser('https://jsonplaceholder.typicode.com/todos/1')
>>> print(parser.response)
<Response [200]>
>>> print(parser.request)
<PreparedRequest [GET]>
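The download-on-construction behaviour described above can be pictured with a small standalone sketch. This is not how soupstars is implemented: `SketchParser`, its `fetch` parameter, `default_fetch`, and `fake_fetch` are hypothetical names, and the standard library's `urllib` stands in for whatever HTTP client soupstars uses internally. The fake fetch function is an assumption added so the example runs without network access.

```python
import urllib.request


def default_fetch(request):
    """Perform the HTTP request and return the raw body as bytes."""
    with urllib.request.urlopen(request) as response:
        return response.read()


class SketchParser:
    """Hypothetical stand-in for a parser that downloads its page on construction."""

    def __init__(self, url, fetch=default_fetch):
        # Build and keep the request object so it can be inspected later.
        self.request = urllib.request.Request(url, method="GET")
        # Download immediately; there is no separate "load" step for the caller.
        self.response = fetch(self.request)


def fake_fetch(request):
    """Offline replacement for default_fetch (an assumption for this example)."""
    return b'{"id": 1}'


parser = SketchParser(
    "https://jsonplaceholder.typicode.com/todos/1",
    fetch=fake_fetch,
)
print(parser.request.get_method())
print(parser.request.full_url)
print(parser.response)
```

Keeping both objects on the instance means later methods can inspect what was sent as well as what came back, which matches the `parser.request` and `parser.response` attributes shown above.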
Serializers
Serializers help convert parsers into storable objects. The functions defined in this module tell soupstars how to perform the serialization.
soupstars.serializers.serialize(function)
Decorating a function defined on a parser with serialize instructs soupstars to include that function’s return value when building its own serialization.
>>> from soupstars import Parser, serialize
>>> class MyParser(Parser):
...     @serialize
...     def length(self):
...         return len(self.response.content)
...
>>> parser = MyParser('https://jsonplaceholder.typicode.com/todos/1')
>>> parser.serializer_names()
['length']
>>> 'length' in parser.to_dict()
True
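For readers curious how a marking decorator of this kind can work, here is a minimal sketch under stated assumptions. It is not the soupstars source: the `_is_serialized` marker attribute and the `SketchBase` class are hypothetical, and only the method names `serializer_names` and `to_dict` are taken from the example above. The `serialize` function defined here is a local stand-in, not the library's.

```python
def serialize(function):
    """Mark a method so the base class includes its return value when serializing."""
    function._is_serialized = True  # hypothetical marker attribute
    return function


class SketchBase:
    """Hypothetical base class that collects every method marked by serialize."""

    def serializer_names(self):
        names = []
        # dir() returns attribute names in sorted order.
        for name in dir(type(self)):
            attribute = getattr(type(self), name)
            if callable(attribute) and getattr(attribute, "_is_serialized", False):
                names.append(name)
        return names

    def to_dict(self):
        # Call each marked method and key its result by the method name.
        return {name: getattr(self, name)() for name in self.serializer_names()}


class MyParser(SketchBase):
    @serialize
    def title(self):
        return "An item!"

    def helper(self):
        # Not decorated, so it is left out of the serialization.
        return "not serialized"


parser = MyParser()
print(parser.serializer_names())  # ['title']
print(parser.to_dict())           # {'title': 'An item!'}
```

Tagging the function object itself keeps the decorator stateless, so subclasses inherit marked methods without any extra registration step.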