Imperceptible Perturbations support for TextAttack #817
Open: vlwk wants to merge 64 commits into QData:master from vlwk:master
Reorderings, deletions, invisible characters
Summary
This PR adds the Bad Characters: Imperceptible NLP attack. It introduces a new dimension to attacks: perturbations that are invisible (on some rendering systems). Full details can be found in textattack.attack_recipes.BadCharacters2021. It uses a combination of the differential evolution search algorithm, four different transformations (invisible characters, homoglyphs, deletions, reorderings) and various goal functions.
Notes:
Additions
Attack recipe:
BadCharacters2021 recipe as textattack.attack_recipes.BadCharacters2021.
Tests:
tests.badcharacters2021, which contains a notebook badcharacters.ipynb and a requirements file requirements.txt. You may run the entire notebook from start to end. Flags at the top of the notebook set whether the downloaded temp files and the results are saved or deleted automatically. The perturbation type can also be chosen. Each of the five experiments in the paper is replicated in this notebook, along with custom model wrappers.
Docs:
Ran sphinx-apidoc -f -o apidoc -d 6 -E -T -M ../textattack to generate the content in apidoc. Note: this seemed to make minor modifications to every single file in apidoc. Not sure if that is intended behaviour.
Transformations:
WordSwapDifferentialEvolution in textattack.transformations
WordSwapInvisibleCharacters, WordSwapDeletions, WordSwapReorderings in textattack.transformations, which extend WordSwapDifferentialEvolution.
intentional_homoglyphs.txt to textattack.shared.
WordSwapHomoglyphSwap was modified.
Search methods:
DifferentialEvolution in textattack.search_methods
Goal functions:
textattack.goal_functions.custom
LogitSum, NamedEntityRecognition, TargetedBonus, TargetedStrict in textattack.goal_functions.custom
MaximizeLevenshtein in textattack.goal_functions.text
Goal function results:
textattack.goal_function_results.custom
LogitSumGoalFunctionResult, NamedEntityRecognitionGoalFunctionResult, TargetedBonusGoalFunctionResult, TargetedStrictGoalFunctionResult in textattack.goal_function_results.custom
Validators:
transformation_consists_of_word_swaps_differential_evolution in shared.validators. This is used to check that the DifferentialEvolution search method is used with a compatible transformation, which must subclass WordSwapDifferentialEvolution.
AttackArgs:
Requirements:
Levenshtein
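The compatibility rule described under Validators can be sketched as a plain isinstance check. This is a minimal, self-contained sketch: the classes below are empty stand-ins rather than the real TextAttack classes, check_transformation_compatibility is a hypothetical helper, and the body of the validator is an assumption about how such a check might be written, not the code from this PR.

```python
class WordSwapDifferentialEvolution:
    """Empty stand-in for the base class named in this PR (assumption)."""


class WordSwapInvisibleCharacters(WordSwapDifferentialEvolution):
    """Stand-in for a transformation that extends the base class."""


class SomeOtherTransformation:
    """Hypothetical transformation that does not extend the base class."""


def transformation_consists_of_word_swaps_differential_evolution(transformation):
    """Return True if the transformation subclasses the DE base class.

    Assumed logic: the PR only states that the transformation must
    subclass WordSwapDifferentialEvolution.
    """
    return isinstance(transformation, WordSwapDifferentialEvolution)


def check_transformation_compatibility(transformation):
    """Hypothetical helper: reject transformations the search cannot drive."""
    if not transformation_consists_of_word_swaps_differential_evolution(transformation):
        raise ValueError(
            f"{type(transformation).__name__} is not compatible with the "
            "DifferentialEvolution search method"
        )
    return True
```

A search method could call such a helper once, before the search starts, so that an incompatible pairing fails early with a clear message.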
Changes
I tried to minimise changes to existing files.
allow_skip in textattack.goal_functions.GoalFunction, which defaults to True. When set to False, the attack continues even if the initial_result already meets the goal. This was needed to replicate the experiments in the paper.
Design choices
The custom model wrappers were not added to the textattack.models.wrappers folder; instead, I opted to leave them in the notebook at tests.badcharacters2021. This mirrors the tutorials listed in TextAttack's documentation. If the maintainers would like, this notebook can be moved to the tutorials section.
It made sense to add the DifferentialEvolution search method and the various Transformations to the textattack.search_methods and textattack.transformations folders.
The _get_transformations and _get_replacement_words functionality was insufficient for two reasons:
textattack.transformations.WordSwapDifferentialEvolution. This class provides a clean interface for my needs. Subclasses are required to implement two methods:
get_bounds_and_precomputed(current_text): returns the bounds used by Differential Evolution to sample the perturbation vector, and any precomputed data needed to efficiently apply perturbations (e.g., homoglyph maps).
apply_perturbation(current_text, perturbation_vector, precomputed): applies a perturbation vector to an input AttackedText object and returns the modified AttackedText.
Precomputed data is passed from get_bounds_and_precomputed to apply_perturbation, rather than being recalculated on every call.
_get_replacement_words for all of the new transformations.
The DifferentialEvolution search method also checks for transformation compatibility. The transformation must be an instance of WordSwapDifferentialEvolution.
The new goal functions were added under the textattack.goal_functions folder, mainly because I didn't want to make the attack recipe too bloated. At first I placed them under the classification and text subfolders, but some things didn't match. For example, one of my attacks required as input an array of logits that did not sum to 1, so I had to override _process_model_outputs. Another was for a Named Entity Recognition task, which outputs one score/label per input token instead of one score per input sentence. Most required slightly different _get_score functions as well.
The goal functions in textattack.goal_functions.custom have matching GoalFunctionResult classes as well. I could have used TextToTextGoalFunctionResult, because I just wanted to override get_colored_output, but decided that wouldn't be fully accurate.
Checklist
.rst file in TextAttack/docs/apidoc.
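To illustrate the two-method interface described under Design choices (get_bounds_and_precomputed and apply_perturbation), here is a minimal sketch of how a differential evolution loop could drive such a transformation. Everything in it is an assumption made for illustration: it works on plain strings rather than AttackedText, the class names and the position-only encoding of the perturbation vector are hypothetical, toy_score stands in for a real goal function, and the loop is a generic rand/1/bin differential evolution, not the PR's DifferentialEvolution search method.

```python
import random
from abc import ABC, abstractmethod

ZWSP = "\u200b"  # zero-width space, one example of an invisible character


class DifferentialEvolutionTransformationSketch(ABC):
    """Hypothetical stand-in for the two-method interface in this PR."""

    @abstractmethod
    def get_bounds_and_precomputed(self, current_text):
        """Return (bounds, precomputed) for the given text."""

    @abstractmethod
    def apply_perturbation(self, current_text, perturbation_vector, precomputed):
        """Return the text with the perturbation vector applied."""


class InvisibleCharacterInsertion(DifferentialEvolutionTransformationSketch):
    def __init__(self, num_insertions=1):
        self.num_insertions = num_insertions

    def get_bounds_and_precomputed(self, current_text):
        # One dimension per insertion; -1 encodes "no insertion" (assumed encoding).
        bounds = [(-1, len(current_text))] * self.num_insertions
        precomputed = {"invisible_char": ZWSP}
        return bounds, precomputed

    def apply_perturbation(self, current_text, perturbation_vector, precomputed):
        chars = list(current_text)
        # Insert from right to left so earlier positions stay valid.
        positions = sorted((int(round(p)) for p in perturbation_vector), reverse=True)
        for pos in positions:
            if 0 <= pos <= len(current_text):
                chars.insert(pos, precomputed["invisible_char"])
        return "".join(chars)


def differential_evolution(score_fn, bounds, pop_size=10, max_iter=20, f=0.5, cr=0.7, seed=0):
    """Generic rand/1/bin differential evolution that minimises score_fn."""
    rng = random.Random(seed)
    dim = len(bounds)
    population = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [score_fn(v) for v in population]
    for _ in range(max_iter):
        for i in range(pop_size):
            others = [p for j, p in enumerate(population) if j != i]
            a, b, c = rng.sample(others, 3)
            # Mutation, clipped back into the bounds.
            mutant = [
                min(max(a[k] + f * (b[k] - c[k]), bounds[k][0]), bounds[k][1])
                for k in range(dim)
            ]
            # Binomial crossover; j_rand guarantees at least one mutant gene.
            j_rand = rng.randrange(dim)
            trial = [
                mutant[k] if (rng.random() < cr or k == j_rand) else population[i][k]
                for k in range(dim)
            ]
            trial_score = score_fn(trial)
            if trial_score <= scores[i]:
                population[i], scores[i] = trial, trial_score
    best = min(range(pop_size), key=scores.__getitem__)
    return population[best], scores[best]


def toy_score(text):
    """Toy stand-in for model confidence: 1.0 while 'good' survives intact."""
    return 1.0 if "good" in text else 0.0


transformation = InvisibleCharacterInsertion(num_insertions=1)
original = "a good movie"
bounds, precomputed = transformation.get_bounds_and_precomputed(original)
best_vector, best_score = differential_evolution(
    lambda v: toy_score(transformation.apply_perturbation(original, v, precomputed)),
    bounds,
)
adversarial = transformation.apply_perturbation(original, best_vector, precomputed)
print(repr(adversarial), best_score)
```

In this sketch the bounds and the precomputed dictionary are computed once per input and reused for every candidate vector, which is the point of passing precomputed data from get_bounds_and_precomputed to apply_perturbation.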