This type is a combination of an ISO 639 language code and an ISO 3166 country code. The
language code comes first and is a two-character lowercase code. The country code comes second
and is a two-character uppercase code. The country code is optional.
This structure allows for greater flexibility because different countries have drastically
different types and sets of profanity. England and the United States, for example, both have
English as the predominant language, but each considers different words to be profanity.
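The format described above can be checked mechanically. The following is a minimal sketch, not part of the WebService itself. The underscore separator (as in "en_US") and the names `LOCALE_PATTERN`, `Locale` and `parse_locale` are assumptions made for illustration, since this documentation does not state how the two codes are joined.

```python
import re
from typing import NamedTuple, Optional

# Assumption: the language and country codes are joined by an underscore,
# e.g. "en_US". The documentation does not name the separator.
LOCALE_PATTERN = re.compile(r"([a-z]{2})(?:_([A-Z]{2}))?")


class Locale(NamedTuple):
    language: str
    country: Optional[str]


def parse_locale(value: str) -> Locale:
    """Split a locale string into its language and optional country code."""
    match = LOCALE_PATTERN.fullmatch(value)
    if match is None:
        raise ValueError(f"not a valid locale: {value!r}")
    return Locale(language=match.group(1), country=match.group(2))
```

This only checks the shape of the value (two lowercase letters, optionally followed by two uppercase letters). It does not check that the codes are actually registered with ISO.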
This is the list of profanity types. Briefly, each type is:
Alcohol - Any alcohol-related words that some applications might want to filter, such as
beer, ale, wine, etc.
Drug - Any drug-related words or slang that some applications might want to filter, such as
pot, weed, bong, etc.
Religion - Currently this category is rather small, but includes words that might offend
certain users who have strong religious beliefs. This includes words and phrases
such as "god damnit", "anti-christ", etc.
Slang - Slang is a very broad category and includes most words that are not considered
direct swears. The slang category includes words such as airhead, bang, bimbo, etc.
Swear - Swear is a very broad category and includes all words that are only considered
direct swears. These words usually have no other meaning than profanity. The Swear
category includes words such as shit, ass, fuck, etc.
Youth - This category contains all the words that are dictionary words and have no slang
meanings but might be deemed inappropriate for younger users. This includes words
such as bestiality, bisexual, homosexual, etc.
This element is the main entry point into the Inversoft Profanity WebService. It controls
both the behavior of the WebService and the results that the WebService returns.
This element has six attributes and two child elements that are used to control the WebService.
The Elements
------------
The authentication element supplies the client credentials to the WebService. It is
required and is used to verify that the caller is valid and that the customer's account
has not exceeded its monthly limits. See the documentation for that element for more
information.
The text element supplies the string that the WebService works on. This is the text that
is to be filtered by the WebService. This element is required and should always contain a
CDATA block.
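Because the text element should always contain a CDATA block, a client needs to make sure the text itself cannot terminate that block early. The following is a minimal sketch of one common way to do this; the helper name `to_cdata` is hypothetical and is not part of any Inversoft client library.

```python
def to_cdata(text: str) -> str:
    """Wrap text in a CDATA section, splitting any embedded "]]>" terminator."""
    # The sequence "]]>" cannot appear inside a CDATA section, so close the
    # section after "]]" and open a new one before ">".
    safe = text.replace("]]>", "]]]]><![CDATA[>")
    return f"<![CDATA[{safe}]]>"
```

For example, `to_cdata("a < b")` yields `<![CDATA[a < b]]>`, which can be placed inside the text element without any further escaping.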
The Attributes
--------------
The tolerance attribute controls how strict or lenient the WebService should be. This
attribute informs the WebService that it should only check the text block for words whose
rating is equal to or greater than the tolerance. Each word that is checked by the
WebService has a rating. The complete list of words can be purchased from Inversoft as the
Inversoft Bad Word Database. You can also email Inversoft at info@inversoft.com to inquire
about the ratings for specific words.
The replacementCharacter attribute is only used if the WebService is called in REPLACE mode
using the operation attribute. If the operation attribute is REPLACE, then this attribute must
be specified. This character is used by the WebService to replace profanity in the text block.
For example, if the WebService is passed "This fucking website sucks" and a
replacementCharacter of '*', it will return the text "This ******* website sucks". If the
WebService is called in FIND mode, this attribute is ignored.
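The replacement behavior described above keeps the length of the text unchanged: each character of a matched word is swapped for the replacement character. The following is a minimal local sketch of that masking step, assuming the matched spans are already known as (start, end) character offsets. The function name `mask_matches` is hypothetical and this is not the WebService's own implementation.

```python
from typing import List, Tuple


def mask_matches(text: str, spans: List[Tuple[int, int]], replacement_character: str) -> str:
    """Replace every character inside each (start, end) span with one character."""
    if len(replacement_character) != 1:
        raise ValueError("replacement_character must be exactly one character")
    chars = list(text)
    for start, end in spans:
        # Same-length replacement, so later offsets stay valid.
        chars[start:end] = replacement_character * (end - start)
    return "".join(chars)
```

With the placeholder input `"This badword website sucks"` and the span `(5, 12)`, the result is `"This ******* website sucks"`.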
The profanityTypes attribute is a comma-separated list of profanity types to find or replace.
This list uses the enumerated values of the profanityType simpleType. If you want to have
the WebService filter Slang, Swear and Religion words from the text block, then this attribute
should be set to a value of "Slang,Swear,Religion".
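A client can build this attribute value from a fixed set of names rather than from free-form strings. The following is a minimal sketch, assuming the enumerated values are spelled exactly as the type names in the profanity type list earlier in this documentation (Alcohol, Drug, Religion, Slang, Swear, Youth). The class and function names are hypothetical.

```python
from enum import Enum
from typing import Iterable, List


class ProfanityType(Enum):
    # Assumption: values mirror the type names listed in this documentation.
    ALCOHOL = "Alcohol"
    DRUG = "Drug"
    RELIGION = "Religion"
    SLANG = "Slang"
    SWEAR = "Swear"
    YOUTH = "Youth"


def profanity_types_attribute(types: Iterable[ProfanityType]) -> str:
    """Join the selected types into the comma-separated attribute value."""
    return ",".join(t.value for t in types)


def parse_profanity_types(value: str) -> List[ProfanityType]:
    """Parse an attribute value back into types; unknown names raise ValueError."""
    return [ProfanityType(name.strip()) for name in value.split(",")]
```

Using an enumeration means a misspelled type name fails on the client before a request is ever sent.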
The locale attribute identifies the language of the text block. This attribute is a
combination of a language code and a country code, and the documentation for that type
contains additional information. Currently, this attribute is ignored and the WebService
only finds/replaces American English profanity, but other languages will be supported in
the future.
The filterType attribute is used to tune the performance of the WebService. There are
currently two settings, FAST and STANDARD. The FAST filterType informs the WebService that
it should work as quickly as possible. This, however, could cause complex and tricky instances
of profanity to be missed. The STANDARD filterType informs the WebService to work in standard
mode and find as much profanity as possible. The STANDARD filterType does cause the WebService
to run slower because it must make a more exhaustive scan of the text block. However, the
WebService is fully tuned, and medium-sized text blocks of around 200-400 words can often be
handled with the STANDARD filterType in less than 100 milliseconds. This does not include
network latency.
The operation attribute tells the WebService what type of operation to perform. There are
currently two operations, FIND and REPLACE. Each operation produces a different result in
the response XML document that the WebService returns. The FIND operation informs the
WebService to find all instances of profanity in the text block and return a list of
matches. Each match includes the word that was matched and its location in the text (see
the response documentation for more information). The REPLACE operation informs the
WebService to replace all profanity in the text block with the replacementCharacter and
return the new text.
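The result of a FIND operation can be pictured as a list of (word, location) pairs. The following is a minimal local sketch of that shape using a placeholder word list. It assumes that the location is a zero-based character offset and that matching is case-insensitive on whole words, neither of which is stated in this documentation. The names `Match` and `find_matches` are hypothetical, and the response documentation remains the authority on the real format.

```python
import re
from typing import List, NamedTuple, Set


class Match(NamedTuple):
    word: str   # the word exactly as it appears in the text
    start: int  # assumption: zero-based character offset into the text


def find_matches(text: str, words: Set[str]) -> List[Match]:
    """Return every whole-word occurrence of a listed word, in text order."""
    matches = []
    for token in re.finditer(r"[A-Za-z']+", text):
        if token.group().lower() in words:
            matches.append(Match(token.group(), token.start()))
    return matches
```

A caller in FIND mode would typically use such a list to highlight or reject the text, whereas REPLACE mode returns the rewritten text directly.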
The rating is an integer from 1-10 that is used to tell the filter to be more or less
strict. Words that are considered more profane have a higher rating. Words that are less
offensive have a lower rating. The rating is mainly used to tune the performance of the
filter while still capturing all the words that the application wishes to filter. The
higher the rating, the fewer words need to be filtered and the faster the filter will
perform.
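The effect of the rating threshold can be sketched as a simple selection over rated words: only words whose rating is equal to or greater than the threshold are checked. The ratings and the function name `words_to_check` below are invented for illustration. Real ratings come from the Inversoft Bad Word Database.

```python
from typing import Dict, List


def words_to_check(ratings: Dict[str, int], tolerance: int) -> List[str]:
    """Return the words whose rating is equal to or greater than the tolerance."""
    # Assumption: the threshold uses the same 1-10 scale as the word ratings.
    if not 1 <= tolerance <= 10:
        raise ValueError("tolerance must be between 1 and 10")
    return sorted(word for word, rating in ratings.items() if rating >= tolerance)


# Hypothetical ratings for three placeholder words.
EXAMPLE_RATINGS = {"mild": 2, "moderate": 5, "severe": 9}
```

With these example ratings, a threshold of 5 selects only "moderate" and "severe", while a threshold of 1 selects all three words. This is why a higher threshold leaves the filter with less work to do.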