[[!meta title="The X New Developer’s Guide: X Window System Concepts"]]
# X Window System Concepts
*Alan Coopersmith*

[[!toc levels=3 startlevel=2]]

This chapter introduces the basic X Window
System concepts and terminology you will need to
understand. Once you are comfortable with these concepts, you will be ready
to dive deeper into specific topics in later chapters.

<p align="center"><img src="../xorg.svg" width="80%"/></p>

## X Is Client / Server

The X Window System was designed to allow multiple programs
to share access to a common set of hardware. This hardware
includes both input devices such as mice and keyboards, and
output devices: video adapters and the monitors connected to
them. A single process was designated to be the controller
of the hardware, multiplexing access to the applications.
This controller process is called the X server, as it
provides the services of the hardware devices to the client
applications. In essence, the service the X server provides
is access, through the keyboard, mouse and display, to the X
user.

Like many client/server systems, the X server typically
provides its service to many simultaneous clients. The X
server runs longer than most of the clients do, and listens
for incoming connections from new clients.

Many users will only ever use X on a standalone laptop or
desktop system. In this setting, the X clients run on the
same computer as the X server. However, X defines a stream
protocol for client/server communication. This protocol
can be exposed over a network to allow clients to connect to
a server on a different machine.  Unfortunately, in this
model, the client/server labeling can be confusing. You may
have an X server running on the laptop in front of you,
displaying graphics generated by an X client running on a
powerful machine in a remote machine room. For most other
protocols, the laptop would be a client of file sharing,
http or similar services on the powerful remote machine.  In
such cases, it is important to remind yourself that
keyboards and mice connect to the X server. It is also the
one endpoint to which all the clients (terminal windows, web
browsers, document editors) connect.

## X In Practice

This section describes some of the fundamental pieces of X
and how they work. This is one of those places where
everything wants to be presented at once, so the section is
something of a mish-mash. Recommended reading practice is to
skim it all once, and then go back and read it all again.

### Input

As mentioned earlier, the X server primarily handles two
kinds of hardware: input devices and output
devices. Surprisingly, the input handling tends to be the
more difficult and complicated of the two. Input is
multi-source, concurrent, and highly dependent on complex
user preferences.

#### Input via Keyboard

One of the tasks the X server performs is handling typing on
keyboards and sending the corresponding key events to the
appropriate client applications. In a simple X
configuration, one client at a time has the "input focus"
and most key events will go to that client.  Depending on
window manager configuration, focus may be moved to another
window by simply moving the mouse to another window,
clicking the mouse, using a hotkey, or by manipulating a
panel showing available clients. The client with focus is
usually highlighted in some way, so that the user can know
where their input will go. Clients may use "grabs"
(described later in this chapter) to override the default
delivery of key events to the focused client.

There is a wide variety of keyboards in the world. This is
due to differing language requirements, to differing
national standards, and to hardware vendors trying to
differentiate their products. This variety makes the mapping
of key events from hardware "key codes" into text input a
challenging and complex process.  The X server reports a
simple 8-bit keycode in key press and release events. The
server also provides a keyboard mapping from those keycodes
to "KeySyms" representing symbolic labels on keys ("A",
"Enter", "Shift", etc.). Keycodes have no inherent meaning
outside a given session; the same key may generate different
code values on different keyboards, servers, configurations,
or operating systems. KeySym values are globally-assigned
constants, and are thus what most applications should be
concerned with. The X Keyboard (XKB) extension provides
complex configuration and layout handling, as well as
additional key handling functionality that was missing in
the original protocol.  Xlib and toolkits also provide input
methods for higher level input functions, such as compose
key handling or mapping key sequences to complex characters
(for example, Asian language input).
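
The keycode-to-KeySym lookup described above can be sketched as a small table. This is a toy model, not how the server or XKB stores its mapping: the keycodes (38, 36, 50) and the helper name `keycode_to_keysym` are assumptions chosen for illustration, while the KeySym values are the globally assigned constants.

```python
# Toy model of a keyboard mapping: keycode -> list of KeySyms.
# The keycodes are illustrative only; real values vary between sessions.
# The KeySym values are the globally assigned constants.
XK_a, XK_A = 0x0061, 0x0041
XK_Return = 0xFF0D
XK_Shift_L = 0xFFE1
NO_SYMBOL = 0

KEYMAP = {
    38: [XK_a, XK_A],   # hypothetical keycode for the "A" key
    36: [XK_Return],    # hypothetical keycode for "Enter"
    50: [XK_Shift_L],   # hypothetical keycode for left "Shift"
}


def keycode_to_keysym(keymap, keycode, shifted=False):
    """Return the KeySym for a keycode; index 1 holds the shifted symbol."""
    syms = keymap.get(keycode, [])
    if not syms:
        return NO_SYMBOL
    index = 1 if shifted and len(syms) > 1 else 0
    return syms[index]


print(hex(keycode_to_keysym(KEYMAP, 38)))                # 0x61
print(hex(keycode_to_keysym(KEYMAP, 38, shifted=True)))  # 0x41
```

An application written against KeySyms keeps working when the table changes, which is why the keycode column is the part that is safe to treat as arbitrary.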

#### Input via Mouse

The X protocol defines an input "pointer" (no relation to
the programming concept). The pointer is represented on
screen by a cursor; it is usually controlled by a mouse or
similar input device. Applications can control the cursor
image.  The core protocol contains simple 2-color cursor
image support. The Render extension provides alpha-blended
32-bit color cursor support; this support is normally
accessed through libXcursor.
       
Pointer devices report motion events and button press and
release events to clients.  The default configuration of the
Xorg server has a single pointer. This pointer aggregates
motion and button events from all pointer-type devices
attached to the server: for example, a laptop's touchpad and
external USB mouse. Users can use the Multi-Pointer X (MPX)
functionality in version 2.0 of the Xinput extension to enable multiple
cursors and assign devices to each one. With MPX, each
pointer has its own input focus. Each pointer is paired with
keyboards that provide input to the client that has the
input focus for that pointer.
       
#### Input via Touchpad
       
For basic input, a touchpad appears to clients as just
another device for moving the pointer and generating button
events.  Clients that want to go beyond mouse emulation can
use the Xinput extension version 2.2 (shipped with Xorg
1.12) or later to enable support for multitouch event
reporting.
       
#### Input via Touchscreen

[XXX write me --po8]

#### Advanced Input Devices and Techniques

[Make whot write this? or steal from http://who-t.blogspot.com? --alanc]

### GetImage: Reading From the Display

The X server does not keep track of what it has drawn on the
display. Once bits are rendered to the frame buffer, its
responsibility for them has ended. If bits need to be
re-rendered (for example, because they were temporarily
obscured), the X server asks a client---usually either a
compositing manager or the application that originally drew
them---to draw them again.

In some situations, most notably when taking "screenshots",
a client needs to read back the contents of the frame buffer
directly. The X protocol provides a GetImage request for
this case.

GetImage has a number of drawbacks, and should be avoided
unless it is absolutely necessary. GetImage is typically
extremely slow, since the hardware and software paths in
modern graphics are optimized for writing pixels to the
display at the expense of reading them back. GetImage is also
hard to use properly. Here, more than anywhere else in the X
protocol, the underlying hardware is exposed to clients. The
requested frame buffer contents are presented to the client
with the frame buffer's alignment, padding and byte
ordering. Generic library code is available in Xlib and XCB
to deal with the complexity of translating the received
frame buffer into something useful. However, using this code
further slows processing.
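
To give a feel for the padding and byte-order handling mentioned above, here is a minimal sketch of locating one pixel in a raw image buffer. It assumes a ZPixmap-style layout with whole-byte pixels; the function names are made up for this example, and the parameters mirror the bits-per-pixel and scanline-pad values the server reports for its pixmap formats.

```python
def bytes_per_line(width, bits_per_pixel, scanline_pad):
    """Bytes in one padded scanline; scanline_pad is in bits (8, 16 or 32)."""
    bits = width * bits_per_pixel
    padded_bits = (bits + scanline_pad - 1) // scanline_pad * scanline_pad
    return padded_bits // 8


def read_pixel(data, x, y, width, bits_per_pixel, scanline_pad, msb_first):
    """Extract one pixel value from a raw buffer.

    Only whole-byte pixel sizes (8, 16, 24, 32 bits) are handled here.
    """
    if bits_per_pixel % 8:
        raise ValueError("sub-byte pixels are not handled in this sketch")
    size = bits_per_pixel // 8
    stride = bytes_per_line(width, bits_per_pixel, scanline_pad)
    start = y * stride + x * size
    order = "big" if msb_first else "little"
    return int.from_bytes(data[start:start + size], order)


# A 3-pixel-wide, 24-bit image padded to 32 bits uses 12 bytes per row,
# not 9.
print(bytes_per_line(3, 24, 32))  # 12
```

A client that ignores the padding reads each row from the wrong offset, and one that ignores the byte order gets scrambled color channels.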

### Output
     
#### Rendering / Rasterization
       
The X protocol originally defined a core set of primitive
rendering operations, such as line drawing, polygon filling,
and copying of image buffers.  These did not evolve as
graphics hardware and operations expected by modern
applications moved on, and are thus now mainly used in
legacy applications.
       
Modern applications use a variety of client side rendering
libraries, such as Cairo for rendering 2D images or OpenGL
for 3D rendering.  These may then push images to the X
server for display, or use DRI to bypass the X server and
interact directly with local video hardware, taking
advantage of GPU acceleration and other hardware features.

##### Polygon Rendering Model
         
### Displays and Screens
         
X divides the resources of a machine into Displays and
Screens.  A Display is typically all the devices connected
to a single X server, and displaying a single session for a
single user.  Systems may have multiple displays, such as
multi-seat setups, or even multiple virtual terminals on a
system console.  Each display has a set of input devices,
and one or more Screens associated with it.  A screen is a
subset of the display across which windows can be displayed
or moved. Windows cannot span multiple screens
or move from one screen to another.  Input devices can
interact with windows on all screens of an X server, for example by
moving the mouse cursor from one screen to another.
Originally each Screen was a single display adaptor with a
single monitor attached, but modern technologies have
allowed multiple devices to be combined into one logical screen,
or a single device to be split into several.
         
When connecting a client to an X server, you must specify
which display to connect to, either via the $DISPLAY
environment variable or an application option such as
-display or --display.  The full DISPLAY syntax is
documented in the X(7) man page, but a typical display
specification has the form hostname:display.screen. The "hostname" may be
omitted for local connections, and ".screen" may also be
left off to use the default screen, leaving the minimal
display specification of :display, such as ":0" for the
normal default X server on a machine.
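
The simple form of the display specification can be parsed with a few lines of code. The sketch below handles only the hostname:display.screen shape described here; the full syntax in X(7) has more cases that it deliberately ignores, and `parse_display` is a name invented for this example.

```python
import re

# Matches only the simple "hostname:display.screen" form.
_DISPLAY_RE = re.compile(
    r"^(?P<host>[^:]*):(?P<display>\d+)(?:\.(?P<screen>\d+))?$"
)


def parse_display(name):
    """Split a display name into (host, display, screen).

    host is None for local connections; screen defaults to 0.
    """
    match = _DISPLAY_RE.match(name)
    if match is None:
        raise ValueError("unsupported display name: %r" % name)
    host = match.group("host") or None
    display = int(match.group("display"))
    screen = int(match.group("screen") or 0)
    return host, display, screen


print(parse_display(":0"))                      # (None, 0, 0)
print(parse_display("remote.example.org:1.2"))  # ('remote.example.org', 1, 2)
```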
         
### Graphics contexts
       
A graphics context (GC) is a structure that stores shared state
and common values for X drawing operations, so that clients avoid having
to resend the same parameters with each request.  Clients
can allocate additional graphics contexts as necessary: a client
that needs several different sets of values sets up a separate GC
for each set, then specifies the
appropriate GC for each operation.
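
The idea can be illustrated with a toy model in which drawing requests carry only a GC identifier and the receiving side looks up the stored values. The class and method names below are hypothetical, and the three GC fields are a small subset picked for the example; this is not the Xlib or XCB API.

```python
from dataclasses import dataclass


@dataclass
class GC:
    """A few of the values a graphics context might hold."""
    foreground: int = 0x000000
    background: int = 0xFFFFFF
    line_width: int = 0


class ToyServer:
    """Stores GCs by id so that drawing requests need only name one."""

    def __init__(self):
        self.gcs = {}
        self.drawn = []  # record of (points, foreground, line_width)

    def create_gc(self, gc_id, **values):
        self.gcs[gc_id] = GC(**values)

    def poly_line(self, gc_id, points):
        gc = self.gcs[gc_id]
        self.drawn.append((points, gc.foreground, gc.line_width))


demo = ToyServer()
demo.create_gc(1, foreground=0xFF0000, line_width=3)  # thick red lines
demo.create_gc(2)                                     # default values
demo.poly_line(1, [(0, 0), (10, 10)])
demo.poly_line(2, [(0, 0), (5, 0)])
```

Each `poly_line` call sends only the points and a GC id; switching between the two styles costs nothing beyond naming a different GC.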
       
### Colors (really?) and Visuals
       
X is so old that when it was designed most users had
monochrome displays, with just black and white pixels to
choose from, and even then hardware manufacturers couldn't
agree which was 0 and which was 1.  Those who spent an extra
thousand dollars would have 4 or 8 bit color, allowing
pixels to be chosen from a palette of up to 256 colors.
But now it's 2012, and anyone without 32 bits of color data
per pixel is a luddite.  Still, a lot of complexity remains
here that someone should explain...
       
### Syncing and Flushing connections
       
As described in the Communication chapter, the X protocol
tries to avoid latency by doing as much asynchronously as
possible.  This is especially noticed by new programmers who
call rendering functions and then wonder why they got no
errors but did not see the expected output appear.  Since
drawing operations do not require waiting for a response
from the X server, they are just placed in the client's
outgoing request buffer and not sent to the X server until
something causes the buffer to be flushed.  The buffer will
be automatically flushed when filled, but it takes a lot of
commands to fill the default 32kb buffer size in Xlib.  Xlib
and XCB will flush the buffer when a function is called that
blocks waiting for a response from the server (though which
functions those are differs between the two due to their
different design models; see the Xlib and XCB chapter for
details).  Lastly, clients can explicitly call XFlush() in
Xlib or xcb_flush() in XCB to send all the queued requests
from the buffer to the server.  To both flush the buffer and
wait for the X server to finish processing all the requests
in the buffer, clients can call XSync() in Xlib or
xcb_aux_sync() from the xcb-util library in XCB.
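
The buffering behaviour can be modelled in a few lines. The following is a toy stand-in for a client connection, not the Xlib or XCB implementation; the class name, the `sent` list that stands in for the socket, and the fake reply are all assumptions made for illustration.

```python
class ToyConnection:
    """Queues requests and only 'sends' them when flushed."""

    def __init__(self, buffer_size=32 * 1024):
        self.buffer_size = buffer_size
        self.pending = bytearray()  # the outgoing request buffer
        self.sent = []              # chunks "written to the socket"

    def queue_request(self, request):
        # Flush automatically if this request would overflow the buffer.
        if len(self.pending) + len(request) > self.buffer_size:
            self.flush()
        self.pending += request

    def flush(self):
        if self.pending:
            self.sent.append(bytes(self.pending))
            self.pending.clear()

    def request_with_reply(self, request):
        # Waiting for a reply forces everything queued so far out first.
        self.queue_request(request)
        self.flush()
        return b"reply"  # placeholder for the server's answer


conn = ToyConnection()
conn.queue_request(b"draw a line")
print(len(conn.sent))  # 0: nothing has reached the server yet
conn.flush()
print(len(conn.sent))  # 1
```

The first `print` shows the situation that surprises new programmers: the drawing call returned without error, but nothing has been transmitted.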

### Window System Objects

A variety of objects are used by X.

#### Windows
       
In X, a window is simply a region of the screen into which
drawing can occur.  Windows are placed in a tree hierarchy,
with the root window being a server-created window that
covers the entire screen surface and which lives for the
life of the server. All other windows are children of either
the root window or another window.  The UI elements that
most users think of as windows are just one level of the
window hierarchy.

At each level of the hierarchy, windows have a stacking
order, controlling which portions of windows can be seen
when sibling windows overlap each other.  Clients can
register for Visibility notifications to get an event
whenever a window becomes more or less visible than it
previously was. They may use these events to avoid drawing
portions of the window that are not visible.
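
The tree and stacking order can be sketched with a small class. This is a simplified model for illustration only: the `Window` class, its methods and the rectangle-only geometry are assumptions of the sketch, and real windows carry far more state.

```python
class Window:
    """A rectangle positioned relative to its parent window."""

    def __init__(self, name, x, y, width, height, parent=None):
        self.name = name
        self.x, self.y = x, y
        self.width, self.height = width, height
        self.parent = parent
        self.children = []  # siblings in bottom-to-top stacking order
        if parent is not None:
            parent.children.append(self)

    def raise_to_top(self):
        siblings = self.parent.children
        siblings.remove(self)
        siblings.append(self)

    def window_at(self, px, py):
        """Return the deepest, topmost window containing the point.

        px and py are in this window's own coordinate space.
        """
        for child in reversed(self.children):
            cx, cy = px - child.x, py - child.y
            if 0 <= cx < child.width and 0 <= cy < child.height:
                return child.window_at(cx, cy)
        return self


root = Window("root", 0, 0, 100, 100)
a = Window("a", 0, 0, 50, 50, parent=root)
b = Window("b", 25, 25, 50, 50, parent=root)
print(root.window_at(30, 30).name)  # b, since it was stacked last
a.raise_to_top()
print(root.window_at(30, 30).name)  # a
```

Stacking is only ever resolved among siblings, so raising a window never changes how its own children overlap each other.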

Clients running in traditional X environments will also
receive Expose events when a portion of their window is
uncovered and needs to be drawn because the X server does
not know what contents were there.  When the Composite
extension is active, clients will normally not receive
Expose events since Composite puts the contents of each
window in a separate, non-overlapped offscreen buffer, and
then combines the visible segments of each window onscreen
for display.  Since clients cannot control when they will be
used in a composited vs. legacy environment, they must still
be prepared to handle Expose events on windows when they
occur.
       

#### Pixmaps
       
A pixmap, like a window, is a region into which drawing can
occur.  Unlike windows, pixmaps are not part of a hierarchy
and are not displayed on screen directly.  Pixmap contents
may be copied to windows for display, either directly via
requests such as CopyArea, or automatically by setting a
window's background to be a given pixmap.  Pixmaps may be
stored in system memory, video memory on a graphics adaptor,
or shared memory accessible by both client and server.  A
given pixmap may be moved back and forth between system and
video memory as needed to maintain a good cache of recently
accessed pixmaps in faster access video RAM.  Using the
MIT-SHM extension to store a pixmap in shared memory may
allow the client to push updates faster, by operating
directly on the shared memory region instead of having to
copy the data through a socket to the server, but it may
also prevent the server from moving the pixmap into the
cache in video RAM, making copies to a window on the screen
slower.
     
#### Widgets
     
Applications need more than windows and pixmaps to provide a
user interface: users expect to see menus, buttons, text
fields, etc. in their windows.  These user interface
elements are collectively called widgets in most
environments.  X does not actually provide any widgets in
the core protocol or libraries, only the building blocks
such as rendering methods and input events for them to be
built with.  Toolkits such as Qt and GTK+ provide a common
set of widgets for applications to build with, and a rich
set of functionality to provide good support for a wide
range of uses and users, including those who read different
languages or need accessibility technology in order to use
your application.  Some toolkits have utilized all the
infrastructure X provides around window stacking and
positioning by making each widget a separate window, but
most modern toolkits do this management client side now
instead of pushing it to the X server.
            
#### XIDs
       
Many resources managed by the server are assigned a 32-bit
identification number, called an XID, from a server-wide
namespace.  Each client is assigned a range of identifiers
when it first connects to the X server, and whenever it
sends a request to create a new Window, Pixmap, Cursor or
other XID-labeled resource, the client (usually
transparently in the Xlib or XCB libraries) picks an unused XID
from its range and includes it in the request to the server
to identify the object created by this request.  This allows
further requests operating on the new resource to be sent to
the server without having to wait for it to process the
creation request and return an identifier assignment.  Since
the namespace is global to the X server, clients can
reference XIDs from other clients in some contexts, such as
moving a window belonging to another client.
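
Client-side XID allocation can be sketched as follows. In the core protocol the range is described by a resource-id-base and a resource-id-mask sent during connection setup; the concrete values below are hypothetical, and the sketch assumes the mask is a contiguous run of low bits and never reuses freed identifiers, which real libraries may handle differently.

```python
class XidAllocator:
    """Hands out XIDs from the range a client was assigned."""

    def __init__(self, base, mask):
        self.base = base
        self.mask = mask      # assumed to be contiguous low bits
        self.next_index = 0

    def new_id(self):
        if self.next_index > self.mask:
            raise RuntimeError("XID range exhausted")
        xid = self.base | self.next_index
        self.next_index += 1
        return xid


def owner_base(xid, mask):
    """Recover the base of the client range an XID belongs to."""
    return xid & ~mask


# Hypothetical values for two clients sharing the same mask.
MASK = 0x001FFFFF
client_a = XidAllocator(0x02000000, MASK)
client_b = XidAllocator(0x02200000, MASK)
window = client_a.new_id()
pixmap = client_a.new_id()
print(hex(window), hex(pixmap))  # 0x2000000 0x2000001
```

Because the client picks the identifier itself, it can queue a create request and further requests that use the new XID back to back, with no round trip in between.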
       
#### Atoms
       
In order to reduce the retransmission of common strings in
the X protocol, a simple lookup table mechanism is used.
Entries in this table are known as Atoms. Each has an
integer key that is passed in most protocol operations
requiring them, and a text string that can be retrieved as
needed.  The InternAtom operation finds the Atom id
number for a given string, and can optionally add the string
to the table and return a new id if it's not already found.
The GetAtomName operation returns the string for a given atom id
number.  Atoms are used in a wide variety of requests and
events, but share a single namespace across all operations
and clients of a given X server.
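
A toy version of the atom table makes the two operations concrete. The class below is an illustration only: its method names are modelled on the protocol requests, it starts empty whereas a real server begins with a set of predefined atoms, and the `only_if_exists` flag corresponds to the optional "add if missing" behaviour described above.

```python
class AtomTable:
    """Server-wide table mapping strings to small integer ids."""

    NONE = 0  # returned when a lookup-only request finds nothing

    def __init__(self):
        self._by_name = {}
        self._by_id = {}
        self._next = 1

    def intern_atom(self, name, only_if_exists=False):
        atom = self._by_name.get(name)
        if atom is not None:
            return atom
        if only_if_exists:
            return self.NONE
        atom = self._next
        self._next += 1
        self._by_name[name] = atom
        self._by_id[atom] = name
        return atom

    def get_atom_name(self, atom):
        return self._by_id[atom]


atoms = AtomTable()
wm_name = atoms.intern_atom("WM_NAME")
print(atoms.intern_atom("WM_NAME") == wm_name)               # True
print(atoms.intern_atom("NO_SUCH", only_if_exists=True))     # 0
print(atoms.get_atom_name(wm_name))                          # WM_NAME
```

Clients typically intern the atoms they need once at startup and then pass the integers around, which is where the saving comes from.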
       
#### Properties
       
A common design pattern in X for providing extensible
metadata is the Property mechanism.  A property is a
key-value pair, where the key is a text string, represented as
an X atom, and the value is a typed value, which may also be
an atom, an integer, or some other type.  The core protocol
provides properties on windows and fonts.  The Xinput
extension adds properties to input devices, while the Xrandr
extension adds properties to output devices.
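
As a rough sketch, a window's properties behave like a dictionary whose entries also record a type and an element width. The class below is a toy model with invented method names; it uses plain strings where the protocol would use atoms, and the property names and process id are example values only.

```python
class ToyWindow:
    """Holds properties as name -> (type, format, data)."""

    def __init__(self):
        self.properties = {}

    def change_property(self, name, type_, fmt, data):
        # fmt is the element width in bits.
        if fmt not in (8, 16, 32):
            raise ValueError("format must be 8, 16 or 32")
        self.properties[name] = (type_, fmt, data)

    def get_property(self, name):
        return self.properties.get(name)

    def delete_property(self, name):
        self.properties.pop(name, None)


win = ToyWindow()
win.change_property("WM_NAME", "STRING", 8, b"xterm")
win.change_property("_NET_WM_PID", "CARDINAL", 32, [4242])
print(win.get_property("WM_NAME"))  # ('STRING', 8, b'xterm')
```

The store itself attaches no meaning to any of these entries; only the clients reading them do.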

X itself does not assign any meaning or purpose to window
properties.  However, conventions have been established for
many window properties to provide metadata that is useful
for window and session management.  The initial set of
properties is defined in the X Inter-Client Communication
Conventions Manual (ICCCM), which may be found at
<http://www.x.org/releases/current/doc/>.  This initial set
was later extended by groups working on common functionality
for modern desktop environments at freedesktop.org, which
became the Extended Window Manager Hints (EWMH)
specification, found at
<http://www.freedesktop.org/wiki/Specifications/wm-spec>.

### Grabs
       
Grabs in X provide locking and reservation capabilities.
"Active Grabs" take exclusive control of a given resource
immediately and lock out all other clients until the grab is
released.  "Passive grabs" place a reservation on a
resource, causing an active grab to be triggered at a later
time, when an event occurs, such as a keypress.  These can
be used, for instance, to have a hotkey that goes to a
certain application regardless of which application
currently has input focus.
       
One of the available grabs is the Server Grab. A client that
grabs the server locks out all other clients, preventing any
other application from being able to update the display or
interact with the user until the server grab is released.
A server grab should be released as soon as possible: besides
annoying users when they can't switch to another program, it
may also cause security problems, since the screen lock is
just another client and will be locked out with the rest.
       
The other primary form of grab is on an input device or
event.  Clients can actively grab the keyboard or mouse to
force getting all input from a device, even if the cursor
moves outside the application's window.  Passive grabs can
be placed on specific input events, such as a particular
keypress event or mouse button event, causing an active grab
to automatically occur for that client when the event
happens.
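
The interplay between focus, passive grabs and active grabs on the keyboard can be modelled with a small router. This is a deliberately simplified sketch: the class and method names are hypothetical, keys are plain strings, and it assumes that a grab activated by a passive grab ends when the triggering key is released.

```python
class ToyKeyRouter:
    """Decides which client receives each key event."""

    def __init__(self, focus_client):
        self.focus_client = focus_client
        self.passive = {}       # key -> client holding a passive grab
        self.active = None      # client holding the active grab, if any
        self.active_key = None  # key whose press activated the grab

    def grab_key(self, key, client):
        """Place a passive grab on one key."""
        self.passive[key] = client

    def grab_keyboard(self, client):
        """Take an active grab immediately; fails if another client has it."""
        if self.active is not None and self.active != client:
            return False
        self.active, self.active_key = client, None
        return True

    def ungrab_keyboard(self):
        self.active, self.active_key = None, None

    def press(self, key):
        if self.active is None and key in self.passive:
            # The passive grab turns into an active grab.
            self.active, self.active_key = self.passive[key], key
        return self.active if self.active is not None else self.focus_client

    def release(self, key):
        target = self.active if self.active is not None else self.focus_client
        if self.active is not None and key == self.active_key:
            self.ungrab_keyboard()
        return target


router = ToyKeyRouter("editor")
router.grab_key("F12", "launcher")
print(router.press("a"))      # editor
print(router.press("F12"))    # launcher
print(router.release("F12"))  # launcher
print(router.press("b"))      # editor
```

The hotkey reaches the launcher even though the editor has focus, and ordinary typing is unaffected before and after.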
       
More information can be found in
<http://who-t.blogspot.com/2010/11/high-level-overview-of-grabs.html>.

### Selections, Cut-Copy-Paste
       
[copy-and-paste from
<http://keithp.com/~keithp/talks/selection.ps> and other docs
on <http://www.x.org/wiki/CutAndPaste> ? ]

<nav>
 <div style="border-top: 1px solid black; text-align: center;">
  <a href="../" rel="contents">The X New Developer’s Guide</a><br />
  <a href="../preface"  title="The X New Developer’s Guide: Preface" rel="prev">&lt;&lt; Preface</a>
  |
  <a href="../communication"   title="The X New Developer’s Guide: Communication Between Client and Server" rel="next">Communication Between Client and Server &gt;&gt;</a>
 </div>
</nav>

[[!meta  link="../" rel="contents"]]
[[!meta  link="../preface" rel="prev"]]
[[!meta  link="../communication" rel="next"]]