Today in a particular patent application I uploaded two PDFs in Patentcenter. The Ack Receipt says I uploaded five PDFs (which is not true). When I try to match up the three non-existent PDFs that the Ack Receipt says I uploaded (which I did not) with the three supposedly corresponding PDFs in IFW, the file names do not match and the SHA-512 hashes (message digests) do not match and the file sizes do not match. Nothing about this part of the Ack Receipt is true.
This is really discouraging. If you can’t trust an Ack Receipt, then the entire e-filing system cannot be trusted.
What happened is that I uploaded a PDF file (named “P187092US00.pdf”) that contains a specification, claims, and abstract. Following USPTO procedure, I indicated the starting and ending page numbers for each of the three sections of the PDF file. I also uploaded an Express Request for immediate national-phase processing (“express-request.pdf”). The total number of PDF files that I uploaded was two.
What the Ack Receipt should have said was that I uploaded two PDF files. Instead, the Ack Receipt said this:
Documents Total documents: 4
DOCUMENT | PAGES | DESCRIPTION | SIZE (KB) |
P187092US00.pdf | 31 | – | 141 KB |
P187092US00-SPEC.pdf (1-24) | 24 | Specification | 124 KB |
P187092US00-CLM.pdf (25-30) | 6 | Claims | 97 KB |
P187092US00-ABST.pdf (31-31) | 1 | Abstract | 84 KB |
express-request.pdf | 1 | Transmittal Letter | 36 KB |
I did not upload any file with the name P187092US00-SPEC.pdf. I did not upload any file with the name P187092US00-CLM.pdf. I did not upload any file with the name P187092US00-ABST.pdf.
What the Ack Receipt should have displayed is two SHA-512 hashes (two message digests). Instead, the Ack Receipt said this:
DOCUMENT | MESSAGE DIGEST (SHA-512) |
P187092US00.pdf | DA33E61CEDC7D9B67B013DFFC0A2EA6A6A23293EADCBBD6648675B7BF8D32B9400C14CFE2E4EA3796291A9876C074F391BAD19C46E391B855BB85305C14CB541 |
P187092US00-SPEC.pdf | DA70792AD3A21DD0FA0935AFE586F58E4AA64D7569C6810A389754B1179FD2241F5D9E7A976128B70B889C912DDC255E66E3C42EF26269D1B64C699F570F7EE8 |
P187092US00-CLM.pdf | 10273D5D15170C17D67AD8EBACB739BB83F69AF87D0A543E1C5404C73A81060755F8310092C87EFF98FF985FD85C3487C0BDB7495665FDF80545E0C46B1C3F1A |
P187092US00-ABST.pdf | 1B03E98D66C5A1A9FA7436EA902D318D6EF1686AB9B050737336C4B91EE2589FC85AEC8FB8F2CE54C8BB368A0430834F63789FC19B22362F7995ADE3AB40D0BB |
express-request.pdf | 80B5B5C91B0AA38E88872B7CE035CC25DB29F13C14EE7E97DA1059732CB4341B8B53A47523835F37C28D049B27CB1F36D19A73A528DBD8281E9551AD66F0736E |
There is no file anywhere in IFW (or in SCORE) with a message digest matching any of these five message digests. I do have on my own computer the file P187092US00.pdf and its SHA-512 hash does match the hash listed in the Ack Receipt. I do have on my own computer the file express-request.pdf and its SHA-512 hash does match the hash listed in the Ack Receipt.
But if you were to look in IFW to try to find files that supposedly correspond to any of these five files, you would not succeed. The file called “express-request.pdf” is renamed 184378_17754023_03-25-2022_TRAN.LET.PDF in IFW and its SHA-512 hash is very different:
83e9385910c4e58c22d1999a89624849fdb570fb0e937b09be08b8fa3c75d5c8c87d4875b72a4213159e038996cae84c643e25b3c4cdb504d4928c8e1525660b
The file size for the Express Request has also changed. When uploaded it was 36 KB as shown in the Ack Receipt. When downloaded again from IFW its file size is 10 KB.
There is a “claims” file but its name when downloaded from IFW is very different — 184378_17754023_03-25-2022_CLM.PDF. Its SHA-512 hash when downloaded from IFW is also very different:
d4fc82bcf6224ba6895fe93f28c6436256bebcd74395c248a016e1792b910e40176eb3f57f075f112ec17b5a41790cdf5dd4ed69b5bf8e07789082f552823637
The size of the CLM file from IFW is 220 KB which is wildly different from the size of the CLM file imagined by the Ack Receipt (a mere 97 KB).
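For readers who want to repeat this kind of check on their own files, here is a minimal Python sketch of the comparison described above: hash a file on disk and compare the result with a digest copied from an Ack Receipt (or with the digest of a copy downloaded from IFW). The function names `sha512_of_file` and `matches_receipt` are made up for this illustration, and the sketch assumes the receipt prints a plain hexadecimal SHA-512 digest of the uploaded bytes.

```python
import hashlib
from pathlib import Path


def sha512_of_file(path, chunk_size=1 << 16):
    """Return the SHA-512 digest of a file as uppercase hex, reading in chunks."""
    digest = hashlib.sha512()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest().upper()


def matches_receipt(path, receipt_digest):
    """Compare a local file's digest with a digest copied from a receipt.

    Hex digests are case-insensitive, so both sides are normalized to
    uppercase before comparing (the receipt above prints uppercase, while
    many command-line tools print lowercase).
    """
    return sha512_of_file(path) == receipt_digest.strip().upper()
```

To use it, call `matches_receipt` with the path of the local file and the full digest string from the receipt. A result of `True` means the local file is byte-for-byte the file the digest was computed from; any change to the bytes, however small, gives `False`.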
Note as well the bafflingly large stated file sizes for the three non-existent files P187092US00-SPEC.pdf and P187092US00-CLM.pdf and P187092US00-ABST.pdf. From context one more or less assumes that the USPTO is saying that the way these three files came into existence is that the USPTO had split up the file P187092US00.pdf into three smaller files. But look at the file sizes. The file P187092US00.pdf has a size of 141 KB. That is a number that I am able to objectively confirm because that file is on the hard drive of my computer. Then add up the purported sizes of the three supposedly smaller files. The sum is 305 KB. One wonders how the three supposedly smaller files add up to more than twice the size of the source file.
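The arithmetic in the preceding paragraph can be checked in a few lines of Python. The only inputs are the sizes printed on the Ack Receipt; the variable names are just for this illustration.

```python
# Sizes in KB, exactly as printed on the Ack Receipt quoted above.
source_kb = 141
piece_kb = {"SPEC": 124, "CLM": 97, "ABST": 84}

total_kb = sum(piece_kb.values())
ratio = total_kb / source_kb

print(f"pieces total {total_kb} KB, source {source_kb} KB, ratio {ratio:.2f}")
# prints: pieces total 305 KB, source 141 KB, ratio 2.16
```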
As best I can discern, the files named P187092US00-SPEC.pdf and P187092US00-CLM.pdf and P187092US00-ABST.pdf never actually existed. I never created such files. No such files ever existed in my computer. No files with those names appear anywhere in IFW. No files with those names appear anywhere in SCORE. No files with those SHA-512 hashes exist anywhere on my computer. No files with those SHA-512 hashes exist anywhere in IFW. No files with those SHA-512 hashes exist anywhere in SCORE. No files with those file sizes exist anywhere on my computer. No files with those file sizes exist anywhere in IFW. No files with those file sizes exist anywhere in SCORE.
Notwithstanding all of this, the Ack Receipt says:
This Acknowledgement Receipt evidences receipt on the noted date by the USPTO of the indicated documents, characterized by the applicant, and including page counts, where applicable. It serves as evidence of receipt similar to a Post Card, as described in MPEP 503.
This is simply false.
I truly admire all the work you do to enlighten the IP community.
Apparently, then, the message digest problem isn’t limited to the DOCX fiasco, but is a greater problem that affects everyone, potentially. If I had to make a wild guess, I’d bet that PTO is doing some metadata removal after generating the message digest and before storing the modified file in the IFW, although I would have expected that the metadata removal would have been done before splitting the application into the spec, claims, and abstract pieces. But if the IFW copy doesn’t match the uploaded copy indicated by the message digest, the message digest is not only false, it’s dangerous.
If you split the application file into the three pieces and do a SHA-512 on each piece, does the hash match what’s in your ack receipt?
Yes I wondered about that. I figure that it’s futile to try. The chance that I would split it into three pieces, in just exactly the same way that they did it, seems vanishingly small. Do I embed all of the fonts in PDF number 2 even if only some of the fonts were actually used in PDF number 2? What about the metadata such as the author or the PDF meta-page-number, do I keep that or strip it out? Do I flatten it if it is multilayered, or do I allow the layers to remain in place? And to get a real sense of how futile it surely must be to try to match how they had done their splitting into three pieces, look at their file sizes. Their three files add up to 305 KB. My original PDF was only 141 KB. How in the world could they possibly have split up 141 KB into three parts and ended up with three files which together add up to more than twice that size?
“Their three files add up to 305 KB. My original PDF was only 141 KB. How in the world could they possibly have split up 141 KB into three parts and ended up with three files which together add up to more than twice that size?”
The only answer that I can think of to that probably rhetorical question is that a certain piece of the original PDF (let’s say font information, though I haven’t the faintest idea; I just use these programs, I don’t understand them) needs to be retained in each of the three new PDFs, and that piece is large enough that the three new PDFs together are more than twice the size of the original one.
But this doesn’t negate your original point as I see it, which is that the “receipt” is not a receipt for what you sent, but rather a message saying “this is what we’ve done with what you filed”. Definitely not good.
At least back when you shipped in paper and got a postcard back, you had a reasonable argument that if the filing receipt, say, indicated something different, you could point to the postcard and say “I sent x, you agreed it was x when you stamped and returned the postcard, don’t tell me now that it isn’t”. But we all know of situations where that happened, too; it’s just that the process has gone from merely less than transparent to completely opaque.
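The suggestion above, that some shared data such as fonts gets repeated in each of the three pieces, can at least be tested for plausibility against the receipt's numbers. The sketch below is a toy model and an assumption, not a description of what the USPTO actually does: it supposes the source file is "all the page content plus one copy of a shared block" and each piece is "its own page content plus one copy of the same shared block". The function name `implied_shared_kb` is hypothetical, and the sizes are the rounded KB figures from the receipt.

```python
source_kb = 141
piece_kb = {"SPEC": 124, "CLM": 97, "ABST": 84}


def implied_shared_kb(source, pieces):
    """Size of the shared block implied by the toy model.

    source      = content_total + shared
    sum(pieces) = content_total + n * shared
    so sum(pieces) - source = (n - 1) * shared.
    """
    n = len(pieces)
    return (sum(pieces.values()) - source) / (n - 1)


shared = implied_shared_kb(source_kb, piece_kb)

# What would be left over as each piece's own page content under this model.
own = {name: size - shared for name, size in piece_kb.items()}

print(shared)  # 82.0
print(own)     # {'SPEC': 42.0, 'CLM': 15.0, 'ABST': 2.0}
```

Under this model a shared block of about 82 KB would leave about 42 KB, 15 KB and 2 KB of content for the 24-page specification, the 6-page claims and the 1-page abstract. The numbers are at least not self-contradictory, but this shows only that the explanation is arithmetically possible. It does not make the three listed files exist anywhere, and it does not change the point that the receipt describes files that were never uploaded.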
Full disclosure: I am an IP and tech attorney, but not a patent attorney. But I have been a geek for a long time, and have a question about the benefit of retaining a hash of the original file prior to uploading. The PTO says in relevant part: “The submission of a DOCX file generates a unique hash based on the content within the file. The algorithm is similar to what is currently in EFS-Web for PDF submissions and confirms the DOCX file cannot be changed post-submission.” However, they also state that they automatically remove metadata (https://www.uspto.gov/patents/docx) during the submission process. Therefore, my guess is that the hash of the file you create will not match the hash of the file received/retained by the PTO.
I couldn’t find an answer to this, though I read your (always excellent) articles and did a web search. Any thoughts?
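The guess in this comment is easy to illustrate. A DOCX file is a ZIP package of XML parts, with document properties such as the author stored in a part separate from the body text. The Python sketch below builds two tiny DOCX-like packages in memory that have the same body and differ only in the author metadata, then compares a hash of the whole file with a hash of the body part alone. Everything here is an assumption made for illustration: the packages are not valid Word documents, the helper names are hypothetical, and nothing in it claims to reproduce how the USPTO computes its hash or strips metadata.

```python
import hashlib
import io
import zipfile

# Constant timestamp so the two archives differ only in their content.
FIXED_TIME = (1980, 1, 1, 0, 0, 0)


def build_package(body_xml, author):
    """Build a tiny DOCX-like ZIP in memory (not a valid Word document)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        parts = {
            "word/document.xml": body_xml,
            "docProps/core.xml": f"<creator>{author}</creator>",
        }
        for name, text in parts.items():
            info = zipfile.ZipInfo(name, date_time=FIXED_TIME)
            archive.writestr(info, text.encode("utf-8"))
    return buffer.getvalue()


def file_hash(package_bytes):
    """SHA-512 over every byte of the package, metadata included."""
    return hashlib.sha512(package_bytes).hexdigest()


def body_hash(package_bytes):
    """SHA-512 over the body part only, ignoring the metadata part."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return hashlib.sha512(archive.read("word/document.xml")).hexdigest()


body = "<w:document>What is claimed is: 1. A widget.</w:document>"
uploaded = build_package(body, author="A. Practitioner")
stored = build_package(body, author="")  # same body, author removed

print(file_hash(uploaded) == file_hash(stored))  # False
print(body_hash(uploaded) == body_hash(stored))  # True
```

So if metadata is removed after upload, a hash of the whole file as the filer created it cannot match a hash of the whole file as stored. A hash could survive metadata removal only if it were computed over the body content alone, and in that case the filer could not reproduce it by simply hashing the file on disk.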