
beewey asked

Gzip file retrieved from fetchItemAspects (category_tree_id: 77) seems to be invalid

Hi!
The file I retrieved from the fetchItemAspects call (category_tree_id = 77) seems to be invalid.
I retrieved and decompressed the aspects for category_tree_id 71 successfully.
Here are the steps I followed (for both category trees):
- Downloaded the file using Postman
- Decompressed the downloaded file using the gzip command

- ebay_item_aspects_71.json is correct and works well when I load it using Python, but ebay_item_aspects_77.json is not complete...

Another use case:

- When requesting the item aspects (category_tree_id 71) using the Python requests library, everything works fine. But with category_tree_id 77 I get the following exception:

Traceback (most recent call last):
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/urllib3/response.py", line 441, in _error_catcher
    yield
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/urllib3/response.py", line 770, in read_chunked
    chunk = self._handle_chunk(amt)
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/urllib3/response.py", line 714, in _handle_chunk
    value = self._fp._safe_read(amt)
  File "/usr/local/lib/python3.9/http/client.py", line 628, in _safe_read
    raise IncompleteRead(b''.join(s), amt)
http.client.IncompleteRead: IncompleteRead(6137 bytes read, 4103 more expected)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/requests/models.py", line 816, in generate
    for chunk in self.raw.stream(chunk_size, decode_content=True):
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/urllib3/response.py", line 575, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/urllib3/response.py", line 796, in read_chunked
    self._original_response.close()
  File "/usr/local/lib/python3.9/contextlib.py", line 137, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/urllib3/response.py", line 458, in _error_catcher
    raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(6137 bytes read, 4103 more expected)', IncompleteRead(6137 bytes read, 4103 more expected))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/code.py", line 90, in runcode
    exec(code, self.locals)
  File "<input>", line 1, in <module>
  File "/home/gmonacho/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/222.3739.56/plugins/python/helpers/pydev/_pydev_bundle/pydev_umd.py", line 198, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "/home/gmonacho/.local/share/JetBrains/Toolbox/apps/PyCharm-P/ch-0/222.3739.56/plugins/python/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/gmonacho/projects/dropix/ruby/scripts/ebay/download_ebay_item_aspects.py", line 33, in <module>
    download_ebay_item_aspects()
  File "/home/gmonacho/projects/dropix/ruby/scripts/ebay/download_ebay_item_aspects.py", line 21, in download_ebay_item_aspects
    response = template.session.get(
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/requests/sessions.py", line 600, in get
    return self.request("GET", url, **kwargs)
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/requests/sessions.py", line 587, in request
    resp = self.send(prep, **send_kwargs)
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/requests/sessions.py", line 745, in send
    r.content
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/requests/models.py", line 900, in content
    self._content = b"".join(self.iter_content(CONTENT_CHUNK_SIZE)) or b""
  File "/home/gmonacho/projects/python39_env/lib/python3.9/site-packages/requests/models.py", line 819, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(6137 bytes read, 4103 more expected)', IncompleteRead(6137 bytes read, 4103 more expected))

So I think the German item aspects file (category_tree_id 77) is corrupted. Does anyone have more information on this? Can somebody help me? :)
Thank you
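
One way to narrow this down from Python is to stream the response body to disk in chunks instead of letting requests buffer the whole body in memory. The following is only a sketch under assumptions: the URL, the bearer token, the header, the timeout values and the helper names (write_chunks, download_item_aspects) are placeholders that do not come from the post, and the third-party requests package must be installed. The traceback shows the connection breaking before the body was complete, so the sketch retries the request and never leaves a partial file under the final name. Whether the saved bytes are still gzip-compressed depends on the response headers, since iter_content undoes any Content-Encoding.

```python
import os
import time


def write_chunks(chunks, dest_path):
    """Write an iterable of byte chunks to dest_path via a temporary file.

    The temporary file is renamed only after every chunk has been written,
    so an interrupted download never leaves a truncated file under the
    final name. Returns the number of bytes written.
    """
    tmp_path = dest_path + ".part"
    written = 0
    try:
        with open(tmp_path, "wb") as fh:
            for chunk in chunks:
                if chunk:  # skip empty keep-alive chunks
                    fh.write(chunk)
                    written += len(chunk)
    except BaseException:
        # Remove the partial file, then let the caller see the error.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
    os.replace(tmp_path, dest_path)
    return written


def download_item_aspects(url, token, dest_path, attempts=3, pause=5.0):
    """Stream the response to disk, retrying when the connection breaks.

    url and token are supplied by the caller (hypothetical values here).
    """
    import requests  # third-party; imported here so write_chunks works without it

    headers = {"Authorization": "Bearer " + token}  # assumed auth scheme
    for attempt in range(1, attempts + 1):
        try:
            with requests.get(url, headers=headers, stream=True,
                              timeout=(10, 300)) as resp:
                resp.raise_for_status()
                # iter_content raises ChunkedEncodingError when the body
                # ends early, which is the error shown in the traceback.
                return write_chunks(resp.iter_content(chunk_size=1024 * 1024),
                                    dest_path)
        except (requests.exceptions.ChunkedEncodingError,
                requests.exceptions.ConnectionError):
            if attempt == attempts:
                raise
            time.sleep(pause)
```

If every attempt fails at a similar offset, that points at the server or something in between (proxy, timeout) rather than at the local code, which is useful information to include in a follow-up.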


I cannot confirm that the resulting file is broken. I downloaded tree 77 and parsed it with jq like this:

cat tree77.json | jq > tree77.parsed.json

This produces a 2.4 GB file, but it is human-readable. The jq command didn't throw any errors, and the end of the file looks fine, with nothing missing.

Because you are dealing with extremely large data here, it is very common for a program to run out of memory and throw exceptions. So before you try anything in any language, first verify that the file is okay, so you know whether the problem is with the file or with your code.
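
For anyone who wants to do that check from Python without loading gigabytes into memory, here is a minimal sketch using only the standard library. The function name check_gzip and the chunk size are illustrative choices, not something from this thread. It decompresses the file chunk by chunk, reports whether the gzip stream is complete, and returns the last non-whitespace byte so you can see whether the JSON ends with a closing brace.

```python
import gzip
import zlib


def check_gzip(path, chunk_size=1024 * 1024):
    """Decompress path chunk by chunk without keeping the data in memory.

    Returns (ok, decompressed_size, last_byte). ok is False when the gzip
    stream is truncated or corrupt; last_byte is the final non-whitespace
    byte that could be read.
    """
    size = 0
    last = b""
    try:
        with gzip.open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                size += len(chunk)
                stripped = chunk.rstrip()
                if stripped:
                    last = stripped[-1:]
    except (EOFError, OSError, zlib.error):
        # EOFError: stream ended early; OSError: bad header or CRC mismatch;
        # zlib.error: corrupt compressed data.
        return False, size, last
    return True, size, last
```

A result like (True, <size>, b"}") suggests the download is intact and the problem is on the parsing side; (False, ...) suggests the download itself was cut off.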


Thank you @michab2003 for your help. Could you give me more information about how you downloaded the gzip file from the REST API call?

I used PHP, but to verify the integrity of the data (file) I would suggest using a neutral way to download it, such as Postman.
