  Configuration naming
  
  HTML Purifier 4.0.0 features a new configuration naming system that
  allows arbitrary nesting of namespaces.  While there are certain cases
  in which using two namespaces is obviously better (the canonical example
  is where we were using AutoFormatParam to contain directives for AutoFormat
  parameters), it is unclear whether a general migration to highly
  namespaced directives is a good idea.
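
  As a rough illustration of what "arbitrary nesting" means for a directive
  name, the sketch below splits a dotted name into its namespace path and its
  leaf.  The function name split_directive is hypothetical and this is not
  HTML Purifier's own parsing code; the directive names come from this
  document.

```python
def split_directive(name):
    """Split a dotted directive name into (namespace path, leaf name)."""
    parts = name.split(".")
    # A directive needs at least one namespace and no empty components.
    if len(parts) < 2 or not all(parts):
        raise ValueError(f"not a namespaced directive: {name!r}")
    return tuple(parts[:-1]), parts[-1]


# Single namespace: the traditional Namespace.Directive form.
print(split_directive("Attr.EnableID"))
# Several nested namespaces, as the new naming system allows.
print(split_directive("AutoFormat.RemoveEmpty.RemoveNbsp.Exceptions"))
```

  The open question in this document is not whether such names can be
  parsed, but how deep the namespace path should usually be.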
  
  == Case studies ==
  
  === Attr.* ===
  
  We have a dead duck, HTML.Attr.Name.UseCDATA, which was migrated before we
  decided to think this through thoroughly.
  
  We currently have a large number of directives in the Attr.* namespace.
  These directives tweak the behavior of some HTML attributes.  They have
  the properties:
  
  * While they apply to only one attribute at a time, the attribute can
    span multiple elements (not necessarily all elements, either).
    The information about which elements it impacts is either omitted or
    informally stated (EnableID applies to all elements, DefaultImageAlt
    applies to <img> tags, AllowedRev doesn't say but only applies to <a> tags).
  
  * There is a certain degree of clustering that could be applied, especially
    to the ID directives.  The clustering could be done with respect to
    what element/attribute was used, i.e.
  
      *.id -> EnableID, IDBlacklistRegexp, IDBlacklist, IDPrefixLocal, IDPrefix
      img.src -> DefaultInvalidImage
      img.alt -> DefaultImageAlt, DefaultInvalidImageAlt
      bdo.dir -> DefaultTextDir
      a.rel -> AllowedRel
      a.rev -> AllowedRev
      a.target -> AllowedFrameTargets
      a.name -> Name.UseCDATA
  
  * The directives often reference generic attribute types that were specified
    in the DTD/specification.  However, some of the behavior specifically relies
    on the fact that other use cases of the attribute are not, at present,
    supported by HTML Purifier.
  
      AllowedRel, AllowedRev -> heavily <a> specific; if <link> ends up being
          allowed, we will also have to give users specificity there (we also
          want to preserve generality). DTD %LinkTypes; HTML5 distinguishes
          between <link> and <a>/<area>
      AllowedFrameTargets -> heavily <a> specific, but also used by <area>
          and <form>. Transitional DTD %FrameTarget, not present in Strict;
          HTML5 calls them "browsing contexts"
      Default*Image* -> as a default parameter, is almost entirely exclusive
          to <img>
      EnableID -> global attribute
      Name.UseCDATA -> heavily <a> specific, but the name attribute is also
          heavily used by many other elements
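
  The element/attribute clustering proposed above can be sketched as a plain
  lookup table.  This is only an illustration of the idea: the table is
  copied from the list above, while CLUSTERS and directives_for are
  hypothetical names, not part of HTML Purifier.

```python
# Keys are "element.attribute"; "*" stands for any element.
CLUSTERS = {
    "*.id": ["EnableID", "IDBlacklistRegexp", "IDBlacklist",
             "IDPrefixLocal", "IDPrefix"],
    "img.src": ["DefaultInvalidImage"],
    "img.alt": ["DefaultImageAlt", "DefaultInvalidImageAlt"],
    "bdo.dir": ["DefaultTextDir"],
    "a.rel": ["AllowedRel"],
    "a.rev": ["AllowedRev"],
    "a.target": ["AllowedFrameTargets"],
    "a.name": ["Name.UseCDATA"],
}


def directives_for(element, attribute):
    """Return directives clustered under element.attribute, then wildcard ones."""
    found = []
    for key in (f"{element}.{attribute}", f"*.{attribute}"):
        found.extend(CLUSTERS.get(key, []))
    return found


print(directives_for("img", "alt"))
print(directives_for("a", "id"))
print(directives_for("area", "target"))
```

  The last lookup finds nothing, which is the weakness noted above: keying
  AllowedFrameTargets on a.target hides the fact that <area> and <form> use
  the same attribute type.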
  
  === AutoFormat.* ===
  
  These have the fairly normal pluggable architecture that lends itself to
  a large number of namespaces (pluggability may be the key to figuring
  out when gratuitous namespacing is good).  Properties:
  
  * Boolean directives are fair game for being namespaced: for example,
    RemoveEmpty.RemoveNbsp triggers RemoveEmpty.RemoveNbsp.Exceptions,
    the latter of which only makes sense when RemoveEmpty.RemoveNbsp
    is set to true. (The same applies to RemoveNbsp itself, which only
    matters when RemoveEmpty is set to true.)
  
  The AutoFormat string is a bit long, but is the only bit of repeated
  context.
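
  The "boolean directive as namespace" pattern described above can be
  sketched as a gating rule: a nested directive only has an effect when
  every enclosing directive that exists is switched on.  The helper
  is_active and the sample values are assumptions made for illustration;
  names are written relative to AutoFormat, and this is not how HTML
  Purifier itself evaluates its configuration.

```python
def is_active(name, config):
    """True if every ancestor of `name` that is itself a directive is truthy."""
    parts = name.split(".")
    for i in range(1, len(parts)):
        parent = ".".join(parts[:i])
        # Only prefixes that are real directives act as gates.
        if parent in config and not config[parent]:
            return False
    return True


# Illustrative values only.
config = {
    "RemoveEmpty": True,
    "RemoveEmpty.RemoveNbsp": False,
    "RemoveEmpty.RemoveNbsp.Exceptions": ["td", "th"],
}

# Exceptions is set, but its gate (RemoveNbsp) is off, so it is inert.
print(is_active("RemoveEmpty.RemoveNbsp.Exceptions", config))
# RemoveNbsp itself is gated only by RemoveEmpty, which is on.
print(is_active("RemoveEmpty.RemoveNbsp", config))
```

  Under this reading the namespace nesting documents the dependency for
  free, which is one argument for namespacing boolean directives.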
  
  === Core.* ===
  
  Core is a potpourri of directives, mostly minor behavioral tweaks to
  HTML handling.  In the lists below, each indented line labels the group
  of directives directly above it.
  
      AggressivelyFixLt
      ConvertDocumentToFragment
      DirectLexLineNumberSyncInterval
      LexerImpl
      MaintainLineNumbers
          Lexer
      CollectErrors
      Language
          Error handling (Language is ostensibly a little more general, but
          it's only used for error handling right now)
      ColorKeywords
          CSS and HTML
      Encoding
      EscapeNonASCIICharacters
          Character encoding
      EscapeInvalidChildren
      EscapeInvalidTags
      HiddenElements
      RemoveInvalidImg
          Lexing/Output
      RemoveScriptContents
          Deprecated
  
  === HTML.* ===
  
      AllowedAttributes
      AllowedElements
      AllowedModules
      Allowed
      ForbiddenAttributes
      ForbiddenElements
          Element set tuning
      BlockWrapper
          Child def advanced twiddle
      CoreModules
      CustomDoctype
          Advanced HTMLModuleManager twiddles
      DefinitionID
      DefinitionRev
          Caching
      Doctype
      Parent
      Strict
      XHTML
          Global environment
      MaxImgLength
          Attribute twiddle? (applies to two attributes)
      Proprietary
      SafeEmbed
      SafeObject
      Trusted
          Extra functionality/tagsets
      TidyAdd
      TidyLevel
      TidyRemove
          Tidy
  
  === Output.* ===
  
  These directly affect the output of Generator. These are all advanced
  twiddles.
  
  === URI.* ===
  
      AllowedSchemes
      OverrideAllowedSchemes
          Scheme tuning
      Base
      DefaultScheme
      Host
          Global environment
      DefinitionID
      DefinitionRev
          Caching
      DisableExternalResources
      DisableExternal
      DisableResources
      Disable
          Contextual/authority tuning
      HostBlacklist
          Authority tuning
      MakeAbsolute
      MungeResources
      MungeSecretKey
      Munge
          Transformation behavior (munge can be grouped)
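
  One mechanical way to spot candidates such as the Munge group is to look
  for directives whose names extend another directive's name.  The sketch
  below does this with a plain string-prefix test; group_by_prefix is a
  hypothetical helper, and the result is only a list of candidates, not a
  proposed renaming.

```python
def group_by_prefix(names):
    """Group each name under the first shorter name in the list that prefixes it."""
    groups = {}
    # Visit shorter names first so potential group heads are seen early.
    for name in sorted(names, key=len):
        for head in groups:
            if name.startswith(head):
                groups[head].append(name[len(head):])
                break
        else:
            groups[name] = []
    return groups


uri = [
    "Munge", "MungeResources", "MungeSecretKey",
    "Disable", "DisableExternal", "DisableResources",
    "DisableExternalResources", "MakeAbsolute",
]

for head, members in group_by_prefix(uri).items():
    print(head, members)
```

  A purely textual rule like this would also file HostBlacklist under Host,
  although the list above puts them in different categories, so the
  grouping still needs a human decision about meaning.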