Solution for examining the DOM… #58

Open
navid-zamani opened this issue Aug 6, 2023 · 0 comments
Hey, in the README you write:

> It is still possible for someone to write a bot to exploit a single site by closely examining the DOM. This means that if you are Yahoo, Google or Facebook, negative captchas will not be a complete solution. But if you have a small application, negative captchas will likely be a very, very good solution for you. There are no easy work-arounds to this quite yet. Let me know if you have one.


I thought of polymorphic viruses from the 90s right away. Maybe that can help.

It goes like this: most of the virus's code can be encrypted/scrambled with a different key each time, but the de-scrambler at the start of the code cannot be. So the de-scrambler was built from small snippets that were benign on their own, found everywhere in normal code, and not uniquely identifiable either (compare: paper strips out of a shredder), but were put together in a different random fashion every time.
A detection algorithm therefore had nothing to identify, because the magic was in the sum of the benign parts. And that sum had enough variability that it was infeasible to enumerate all the ways the parts could be put together to do what the author intended. (Basically, as with a hash, there were too many permutations of choices to reverse-engineer it.) The jumps, too, could be made in many different ways (compare: Feynman diagrams), including indirection, function calls, returns, jumps parametrized by values that depend on where the last jump came from, and so on.
Those viruses were rare because they were hard to make. But when they were done well, they were pretty much unstoppable.
One way to make it easier was to encrypt most of the virus and apply polymorphism only to the decryptor.
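To make the principle concrete, here is a toy sketch of my own (not taken from any real virus): a "decoder" assembled from randomly chosen, individually meaningless XOR fragments, applied in a random order, whose combined effect is nevertheless always the same key. All names and the four-fragment split are invented for illustration.

```javascript
// Toy sketch: a polymorphic XOR decoder. The key is split into random
// fragments whose XOR equals the key; each fragment alone is benign,
// and the fragments are shuffled so no fixed byte pattern exists.
function makeDecoder(key) {
  const parts = [];
  let acc = key;
  for (let i = 0; i < 3; i++) {
    const p = Math.floor(Math.random() * 256);
    parts.push(p);
    acc ^= p;
  }
  parts.push(acc); // XOR of all parts === key
  parts.sort(() => Math.random() - 0.5); // order doesn't matter: XOR commutes
  return (byte) => parts.reduce((b, p) => b ^ p, byte);
}

const key = 0x5a;
const encrypt = (b) => b ^ key;
const decode = makeDecoder(key); // different fragments every run
console.log(decode(encrypt(0x41)) === 0x41); // true
```

Because XOR is commutative and associative, any ordering of the fragments yields the same decoder, so a scanner looking for a fixed pattern has nothing stable to match, which is the whole point.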

For captchas and forms, the entire structure of the form (HTML and DOM) and the JS could be scrambled each time, and assembled from benign markup and benign pieces of JS. (JS lends itself especially well to this sort of trickery, say with crazy casts.)
If the CSS that places the input fields in the correct positions were then generated by JS that was itself encrypted, and decrypted only by code that was itself polymorphically scrambled, I don't see even a good neural net managing that without running a full browser engine and accessing the form via pixels.


Oh, and one could of course just implement the form as a renderer that draws into a canvas! No form fields at all. Just some JS (or WASM!) code that draws a form, takes keyboard input, does some magic with code that even a human can't understand, and generates an HTTP POST of one single encrypted BLOB for the server side to unwrap.
To a human, this would look exactly like a bog-standard form. To a computer, it is just a bunch of pixels with unknown permutations of styles, an interface that takes keystrokes with no explanation of what it does with them, a piece of code it can execute but not understand (and neither can the bot writer), and a binary blob going out.
Though I have a feeling that, given image recognition, this would not actually be any harder for a machine to solve.


A last, unorthodox way to solve the problem is to stop assuming there is such a thing as a non-bot human, and simply enable all humans to use bots. Because if everybody has an advantage, then nobody does.
With that view, one can design the whole thing for bots right away, and stop trying to win a war that can never be won.
For example, one would not try to design the site securely; one would design the server interface securely. First some rate limiting, then weeding out the input by what one wants to see, regardless of whether it came from a human. (Because let's be real: when does it start becoming a bot? There's a computer in between in any case. Is my browser's auto-fill function a bot? Is my programmable keyboard a bot? Are my hand's reflexes a bot? Is a passive-thinking peopleoid that blindly parrots what its leader says a bot? … There's also a considerable overlap between the smartest machine and the dumbest human(oid).)
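A hedged sketch of that interface-first approach, with limits and validation rules invented for illustration: a fixed-window rate limiter plus input validation, neither of which ever asks whether the sender is human.

```javascript
// Hypothetical sketch: secure the server interface, not the site.
// Fixed-window rate limiting per client, then validation of the input itself.
const buckets = new Map(); // client id -> { count, windowStart }

function allow(id, limit = 5, windowMs = 60000, now = Date.now()) {
  const b = buckets.get(id);
  if (!b || now - b.windowStart >= windowMs) {
    buckets.set(id, { count: 1, windowStart: now });
    return true;
  }
  b.count += 1;
  return b.count <= limit;
}

// Accept requests by their content, not by their origin.
function validate(input) {
  return typeof input.email === 'string' &&
         /^[^@\s]+@[^@\s]+$/.test(input.email) &&
         typeof input.message === 'string' &&
         input.message.length > 0 && input.message.length <= 2000;
}

// Whether the sender is human or bot never enters the decision:
console.log(allow('10.0.0.1') && validate({ email: 'a@b.c', message: 'hi' })); // true
```

The design choice is that humanity is simply not an input to the decision: a polite bot passes, a flooding human is throttled.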

In the end, if something submits a valid request by the rules of the site owner, … does it matter if it is a machine?
(Like, say, a self-driving car reporting vandalism in a parking lot that prevents it from parking.)


Maybe I could spark some inspiration. :)

P.S.: Am I a bot? XD
