The Semiotics and Conventions of Television

How do we know how to "read" television?

Everyone remembers learning to read a book, but when did you learn to read television or film? People are not born with the ability to understand a television show or film. Visual media contain languages with syntax and "punctuation," as if they were written in English. Viewers learn this system of encoding, just as they learned to read.

An argument could be made that reading television is more difficult than reading a book. An understanding of English grammar, punctuation, syntax, and vocabulary is necessary to read the latest Tom Clancy novel. Reading is no mean feat, as anyone who has picked up William Faulkner for the first time can tell you. Television and film, however, bring together a multitude of languages presented to us in sound, visuals, motion, and color. The following is not meant to be a comprehensive list of all visual languages, which become conventions because they are generally recognized by producers and students of film. Many are discussed in detail in Zettl or Metz.

(1.) Within the camera refers to any choice made during shooting that is controlled strictly by changing how the film is exposed or the video is recorded.

(A.) Framing is how the image is arranged within the camera.

(1.) Over the shoulder shot is framed with the camera looking over the shoulder of one person and usually into the face of a second person. Usually the camera is on the face of the person talking so that the speaker is talking to the audience members. It can also be used to show a reaction shot to what the speaker says. This shot also indicates the physical distance between two people.

(2.) Extreme close up shows only the head of the person or part of the face. This shot can also be used to show important objects in great detail. The ECU can assign importance to the person being shot or provide intimacy. It is used in news when someone is crying to show the audience the emotion of the person. It can also be used in a love scene to indicate the intimate nature of the relationship, since the camera shows the face at kissing distance.


(3.) Close up is the head shot or a shot from the shoulders up. It lets the viewer enter the personal space of the person, usually the speaker.

(4.) Medium shot is a neutral shot, showing the person from chest up or waist up. The camera distance reflects the usual distance of two people talking in our culture.

[Figure: medium shot]

(5.) Long shot shows the person from at least the knees up and places the person within the surrounding context. It is frequently used to establish the scene before going to one of the other shots. It can also be used to indicate a great separation between two people.

(6.) Macro is a lens or lens setting used to make small objects large enough to see, such as an ant on a nature show. It could be used to create an ECU on the face of a watch being shown as an insert edit.

(7.) Asymmetry creates tension or mental stress for the viewer when the camera is tilted so that horizontal and vertical lines no longer run parallel to the edges of the frame.

(B.) Angle is the position of the camera lens in relationship to the person being photographed.

(1.) Low angle is putting the lens below the eye level of the person and shooting up. This assigns power to the individual. This is frequently done when the hero is photographed or a character presents the preferred ideological position of the film.

[Figure: low angle]

(2.) High angle is putting the lens above the eye level of the person and shooting down. This assigns weakness to the individual. High angle is one way of victimizing a character.

[Figure: high angle]

(3.) Eye line is shooting on the eye line of the character. This is a neutral position, essentially showing the character as if an audience member were engaging the character.


(C.) Vectors are lines of motion within the frame that tend to direct the viewer's attention in a particular direction. If a person on the left side of the frame extends an arm and points to the right, viewers will tend to follow the line of motion of the arm and pointed finger. Among the devices used to create vectors are arms and legs, gaze, lighting, and physical objects with strong lines, such as an open car door beckoning you into the front seat of the car.


(D.) Axis is the plane along which the scene is shot.

(1.) X axis is a camera shot on the horizontal axis. This means that the emphasis of the shot is from the left of the frame to the right. Most television shows are shot on the x axis since the screen only holds two or three people comfortably. The most common TV shot is one person on the left and one on the right, with the action running on a horizontal line.

(2.) Y axis is a camera shot on the vertical axis. This means that the emphasis of the shot is from the top of the frame to the bottom. Very few shots are made on the y axis, where the action runs on a vertical line, since both television and film are horizontal media.



(3.) Z axis means the scene is shot on a three-dimensional plane. M*A*S*H and Hill Street Blues were often shot on the z axis because they had an ensemble cast. By shooting on the z axis, the action in the foreground could be placed in relationship to the actions of other characters in the background.

(2.) Lens selection, selective focus and zooms.

(A.) Wide angle shot shows the action in context to the setting. It can also be used to create distortion when used up close on an object or person.

(B.) Telephoto shot will tend to separate the image from the background by putting all of the attention on one plane. A telephoto shot can also be used to distort distance, compressing long distances into a small amount of visual space, as in the scene in The Graduate where Ben runs past the telephone poles.

(C.) Macro shots can show very small detail, even detail too small to see with the human eye. Since this is an image that people generally do not see (literally and figuratively), people will tend to pay attention to macro images.


(D.) The area in focus is generally what the audience will see, ignoring images out of focus. Racking focus in or out changes the direction of the audience's attention. For example, in Top Gun, one scene shows Tom Cruise in focus in the foreground, then the focus is racked out to bring Val Kilmer into focus, visually reinforcing the competition between the two.

(E.) Zooming in or out changes both the focus and, in effect, the lens, permitting combinations of focus shifting and context placement as described under shots and focus. In addition, zooming can create a sense of speed when the zoom is very fast or unusually slow.

(3.) Depth of field refers to the area in focus. A shallow depth of field will concentrate attention on one area of the viewing screen. Deep focus means an infinite area is in focus, such as in a panorama. The f-stop, or lens aperture, controls depth of field. Since recording speed is usually a fixed number, illuminating a large area may be the only way to create a large depth of field in film or video.

[Figure: depth of field]
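
The link between f-stop and depth of field can be made concrete with the thin-lens approximation commonly used in photography. The sketch below is illustrative only: the function name, the 0.03 mm circle of confusion, and the sample 50 mm lens are assumptions chosen for the example, not values taken from the text above.

```python
def depth_of_field(focal_mm, f_stop, subject_mm, coc_mm=0.03):
    """Return (near_mm, far_mm), the limits of acceptable focus.

    Thin-lens approximation. coc_mm is the circle of confusion, an
    assumed value (0.03 mm is a figure often quoted for 35 mm film).
    """
    # Hyperfocal distance: focusing here keeps everything out to infinity sharp.
    hyperfocal = focal_mm ** 2 / (f_stop * coc_mm) + focal_mm
    near = (subject_mm * (hyperfocal - focal_mm)
            / (hyperfocal + subject_mm - 2 * focal_mm))
    if subject_mm >= hyperfocal:
        far = float("inf")  # deep focus: sharp all the way to the horizon
    else:
        far = subject_mm * (hyperfocal - focal_mm) / (hyperfocal - subject_mm)
    return near, far


# Hypothetical setup: a 50 mm lens focused on a subject 3 m away.
for stop in (2, 16):
    near, far = depth_of_field(50, stop, 3000)
    print(f"f/{stop}: sharp from {near / 1000:.2f} m to {far / 1000:.2f} m")
```

Under these assumptions the wide aperture (f/2) keeps only a thin slice around the subject sharp, while the small aperture (f/16) extends the zone of focus several meters. A small aperture admits less light, which is why a large depth of field usually calls for more illumination.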

(4.) Editing is the process, according to film theorist Sergei Eisenstein, of linking two separate ideas together, creating a new meaning through the edit. Shot one shows the boy, shot two shows the girl, and by editing them together, the film editor has created a personal relationship, even if in reality the two never met.

(A.) Cross cutting is described in the previous paragraph. By cutting from a shot of one person to a shot of another person or from a person to an object, a relationship between the two is created.

(B.) Eye line cut is when a person looks in a particular direction, then the film cuts to another shot. The assumption is that the person is looking at whatever is shown in the following shot.

(C.) Montage editing is creating a relationship between two ideas. If one shot shows a rich man eating and the following shot is of a poor baby starving, the conclusion might be drawn that the rich man is taking food from the mouth of the baby. The most famous example of montage editing is in Eisenstein's Battleship Potemkin, particularly the baby carriage scene, where the cruelty of the Czarist troops is contrasted with the anguish of the people being shot and the innocence of the baby in the carriage bouncing down a flight of stairs.

(D.) Editing pace and camera run refer to how long a shot runs before there is an edit. A movie like Speed has a very fast editing pace, meaning there are many shots edited together within a one-minute span. A Saturday morning children's hour advertisement will often have 45 cuts in 30 seconds. A European film may seem to drag because the camera is allowed to record continuously for a minute or more without an edit. A fast editing pace makes the action seem to pass quickly, while a long camera run makes time seem to pass more slowly.
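
Editing pace can be put in numbers. The following is a minimal sketch using the figures mentioned above (45 cuts in 30 seconds, and a one-minute take with no edit); the function names are hypothetical, and it assumes that n cuts join n + 1 shots.

```python
def average_shot_length(duration_s, cuts):
    """Average seconds per shot, assuming n cuts join n + 1 shots."""
    return duration_s / (cuts + 1)


def cuts_per_minute(duration_s, cuts):
    """Editing pace expressed as cuts per minute of screen time."""
    return cuts * 60 / duration_s


# The children's advertisement: 45 cuts in a 30-second spot.
print(round(average_shot_length(30, 45), 2), "seconds per shot")
print(cuts_per_minute(30, 45), "cuts per minute")

# A long camera run: one minute without an edit.
print(average_shot_length(60, 0), "seconds per shot")
```

The advertisement averages well under a second per shot, while the uninterrupted take holds a single image for the full minute, which is the contrast in perceived time described above.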

(E.) Scene, sequence, and shot are three different ways of putting film together on the basis of time and space. The shot is a continuous run of the camera. A scene is two or more shots without a break in time or space. A sequence is two or more shots with a break in time or space. In the scene, the audience is a witness to the action the whole time without a change in location. In the sequence, the audience accepts ellipses. In a scene, we would watch a man sitting on the couch as he stands, walks to the television, turns off the TV, walks to the table, picks up the keys, walks to the front door, opens it, walks out, closes the door, walks down the sidewalk to the car, puts a key in the car door, and unlocks it. In a sequence, we could shorten that to: shot one of the man on the couch, shot two of the man turning off the TV, shot three of him closing the door, shot four of him opening the car door.

(F.) Insert edit is a shot inserted into the narrative flow. The woman is watching TV, cut to an insert edit of the clock, cut back to the woman watching TV. The shot of the clock is an insert edit.

(G.) Jump cut is an unexplained break in time and space. Jump cuts appear in many MTV videos and in Field of Dreams. A music video might show the band running down the alley with the crowd in pursuit and then in the next shot show them running up the alley with the crowd in pursuit, without an explanation of how everyone got turned around. In Field of Dreams, Kevin Costner is driving to Boston to meet James Earl Jones, and he practices what he is going to say to Jones's character to convince him to accompany him to a baseball game. The audience sees Costner practicing line after line in a series of shots that follow one another without explaining the lost time.

(H.) Fades, dissolves, wipes, cuts, and superimposes. Straight cuts are not the only way to advance from one shot to another. A fade to black followed by a return to the image has generally represented a passage of time. Dissolves indicate a closer relationship between shots than a cut. Superimposes show the closest relationship between shots, frequently that a person is thinking about what has been superimposed over the person's image. Wipes vary in meaning too much to discuss them except in general as a convention.


(5.) Sound

(A.) Diegetic sound is that which is heard by the actors.

(1.) Dialogue is the script; the spoken words.

(a.) Pitch is how high or low on the scale the words are spoken. A recent Domino's Pizza ad used pitch as a technique. Boys (high pitch) who ate the deep dish pizza sounded like men (low pitch).

(b.) Tone is the use of voice emphasis to express love, anger, or other human emotions.

(c.) Delivery is the style in which the dialogue is delivered. George Burns was an excellent comic straight man because he knew whether to pause after one of Gracie Allen's statements or repeat it. British stage actors (Richard Burton, Peter O'Toole, Sir Laurence Olivier, Anthony Hopkins) had the delivery of classical stage training.

(d.) Accent is the regional way words are spoken, identifying for us the person's heritage. In The Clinic, the producers selected a boy with a rural Tennessee accent (harsh and uneducated) because that was supposed to be the boy's background.

(2.) Sound effects are background noises heard (breaking glass to jet engines) to add realism.

(3.) Music is sometimes played within a scene. Probably no example is more famous than the "Play it again, Sam" scene in Casablanca. The music sets the mood and reflects the conflict between Ingrid Bergman and Humphrey Bogart.

(B.) Non-diegetic sound is heard by the audience, but not the actors.

(a.) Voice over is usually done by a narrator, who is a main character telling a story about the past. The Waltons is a good example: John-Boy told us his childhood story and then used voice over at the end to explain the lessons he learned.

(b.) Laugh track is an audience cue to be amused at what the situation comedy writers think is funny.

(c.) Music can be violins that alert us there is romance, drums that build suspense, or an orchestra playing as the climax builds. The music is a cue to how to interpret the action on the screen.

(4.) Props and setting are cues that tell the audience how to interpret context.

A Winchester rifle belongs to the Western, an Uzi to Miami Vice. Similarly, the setting presents context. Seinfeld would be a different program if it were set in Peoria instead of New York City.

(5.) Attire is a short cut, particularly in television, to informing the viewers about a character.

The convention that bad guys wear black hats has evolved into The Natural putting Barbara Hershey in black since she is the evil woman, Glenn Close in white because she is the angel who saves the hero, and Kim Basinger in revealing attire because she is the woman without virtue.

(6.) Make up, like attire, cues us to the virtue of the character.

Julia Roberts not only changes clothing in Pretty Woman, but her makeup becomes much more basic and less colorful as she evolves from hooker into wife.

(7.) Verisimilitude (degree of reality) is the extent to which the producers want the film or video to appear to mirror real life.

Roseanne appears to be right out of someone's living room while the original Star Trek series kept using the same Styrofoam rocks in episode after episode, reminding the audience that this was a television set.

(8.) Body language refers to physical movements that convey information to the audience.

(A.) Proximity usually means that the closer two characters are to each other, the more intimate the relationship.

(B.) Gaze has two conventions associated with it. When two people's eyes meet, a relationship is presumed to exist. Also, we have the camera gaze, usually an erotic voyeur gaze of a body with the camera lens representing the view of the audience.

(C.) Touch indicates the nature of a relationship from a punch in the mouth to the caress.

(9.) Reel time versus real time is the relationship between the time that passes within the film and the amount of time that passes in reality.

Reel time can range from a flashback that supposedly occurs mentally in a few seconds to millennia, as in 2001: A Space Odyssey. Real time for most television programs is one-half to one hour.

(10.) Tone of the film refers to the density of the film, as in film noir, a film style mostly of the 1940s and 1950s. Film noir motion pictures were underexposed so that the images would appear dark, even sinister, when projected.

(11.) Motion refers to any movement on the screen.

(A.) Mise-en-scene is often associated with "auteurism" or the concept that the film is the product of its director, reflecting his or her style of production.

(B.) Primary motion is movement of people or objects within the frame.

(C.) Secondary motion is camera movement, such as pans, tilts, dollies, or trucks. A pan is moving the camera head on a stationary tripod on the horizontal plane; a tilt is on the vertical plane. A dolly is moving the camera and tripod toward or away from the subject, and a truck is moving them sideways, parallel to the subject.

(D.) Shot pacing is how much movement occurs within the shot. Action films will usually vary the pace to give the audience a rest.

Raiders of the Lost Ark, for example, opens with lots of action when Indiana Jones finds the temple god statue, and then the pace slows when he returns to the college campus.

(12.) Color has become a science in itself with certain colors reportedly inducing tranquility and others purchasing fever.

Frequently color is associated with attire: a woman in red is read as very sexual, while a woman in pastels controls her passions. Men in blue-striped suits are businessmen, while men in brown sweaters are conservative.

(A.) Filters placed either on the light source or on the lens can influence the color balance.

A flesh-toned filter can make a person appear to be warmer and friendlier, while blue lights are a convention for evil. Sometimes an actress's eye color will be deepened by having her wear colored contact lenses.

(13.) Lighting is a discussion of how light sources are used to create a mood.

(A.) Three-point is the standard lighting for film and television.

Generally, one side of a person's face is lit slightly brighter than the other to create a soft shadow that will present depth when broadcast on a two dimensional screen.

(B.) Backlighting can create several effects: silhouette, background illumination, or background separation.

A silhouette results when the background light is brighter than the foreground light. A halo effect can be created if the backlight can shine through the object being lit. Glenn Close in The Natural stands up at the ballpark as the setting sun backlights her. The light penetrates her hat, hair, and the edges of the material of her dress, giving her an unearthly look. This fits with her role as the angel who saves the hero.

Background illumination can create a deep background, making possible z axis shooting and deep focus. Background separation allows the talent or an object to be separated from a background that would make it difficult to clearly see the subject. Talent on camera usually has a backlight falling across the head and shoulders to create separation.

(C.) Chiaroscuro is strong, one-directional lighting, creating deep shadows with heavy contrast against the lit areas.

(D.) Flat lighting means turning on enough lights to essentially eliminate shadows. It is usually used on game shows and news broadcasts since this "neutral" lighting does not provide for subjective interpretation.

(14.) Point of View is the perspective from which the story is told.

(A.) First person is a story told from the I/We position.

(B.) Third person is a story told from an objective point of view.

(C.) Omniscient viewpoint is from the camera's and audience's viewpoint. We know things going on that the characters don't. We know there's a bomb in the room, but they don't.

(15.) Speed of recording is the speed at which the film or video is shot and/or played.

(A.) Slow motion gives the audience the opportunity to focus on an event and see how it was accomplished.

(B.) Rapid motion is undercranking the camera, that is, putting fewer frames through the camera per second (8 fps, for example) than is normal (24 fps). It was used in silent movies, particularly Buster Keaton and Keystone Cops films, where the characters appear to be running everywhere.
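
The effect of recording speed on apparent motion is simple arithmetic: the action is sped up or slowed down by the ratio of the projection rate to the shooting rate. The sketch below uses the 8 fps and 24 fps figures given above; the function names and the 12-second and 48 fps examples are hypothetical illustrations.

```python
def playback_speed(shot_fps, projected_fps=24):
    """How many times faster than life the action appears on screen."""
    return projected_fps / shot_fps


def screen_seconds(real_seconds, shot_fps, projected_fps=24):
    """How long an action of real_seconds lasts when projected."""
    return real_seconds * shot_fps / projected_fps


# Undercranking: shoot at 8 fps, project at the normal 24 fps.
print(playback_speed(8))      # action appears this many times faster
print(screen_seconds(12, 8))  # a 12-second chase, in seconds of screen time

# Overcranking (slow motion): shoot at 48 fps, project at 24 fps.
print(playback_speed(48))
```

A speed factor above 1 corresponds to the rapid motion of the silent comedies, and a factor below 1 to slow motion.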

(16.) Animation is creating motion through object movement.

Traditionally, a stationary camera shot two frames of a drawing. Slight drawing changes were made and then the revised drawing was shot. The slight differences would appear to be "movement" when projected on the screen.

Today, movement is accomplished by writing a computer program that moves objects and then recording it. Films like Star Wars and Star Trek are made by moving space ship models one frame at a time. Animation gives film and video creators the opportunity to think about their craft in new ways, permitting the filming of that which can't be done in reality.


Most of us are experts at decoding the array of visual communication presented to us on television and film. The average American watches TV seven hours a day. If you are average, by the time you reach 20, you have watched 51,100 hours of television (7 hours a day for 365 days a year over 20 years). Either you spend a lot of time being confused by what you are watching or you have learned how to decode visual signifiers. In turn, those signifiers are part of the way that you think. They are your signifieds arranged in your mental schemas.

Since each of these potential signifying languages reflects an arbitrary choice, and in this context ignorance or ignoring conventions is a choice, film and television offer plenty of opportunity to study the encoding process and its ideological signification by deconstructing the visual texts along these lines of arbitrary selection. Since television and film are primary presenters of information in our culture, the presentation of visual images has tremendous potential influence on our culture. In the next chapter, we will see how these signifiers can be used to encode a text with ideology, which may be the most powerful influence of mass media on our society.