England and Wales High Court (Patents Court) Decisions
Promptu Systems Corporation v Sky UK Ltd & Ors [2021] EWHC 2021 (Pat) (19 July 2021)
URL: http://www.bailii.org/ew/cases/EWHC/Patents/2021/2021.html
Cite as: [2021] EWHC 2021 (Pat)
Neutral Citation Number: [2021] EWHC 2021 (Pat)
Case No: HP-2020-000013
IN THE HIGH COURT OF JUSTICE
BUSINESS AND PROPERTY COURTS OF ENGLAND AND WALES
INTELLECTUAL PROPERTY LIST (ChD)
PATENTS COURT
Rolls Building
Fetter Lane
London, EC4A 1NL
19 July 2021
Before :
MR JUSTICE MEADE
- - - - - - - - - - - - - - - - - - - - -
Between :
PROMPTU SYSTEMS CORPORATION
Claimant
- and -
(1) SKY UK LIMITED
(2) SKY IN-HOME SERVICE LIMITED
(3) SKY SUBSCRIBERS LIMITED
(4) SKY LIMITED
(5) COMCAST CABLE COMMUNICATIONS LLC
Defendants
- - - - - - - - - - - - - - - - - - - - -
Hugo Cuddigan QC and David Ivison (instructed by Powell Gilbert LLP) for the Claimant
Lindsay Lane QC and Maxwell Keay (instructed by Gowling WLG (UK) LLP) for the Defendants
Hearing dates: 11 and 14-16 June 2021
- - - - - - - - - - - - - - - - - - - - -
Covid-19 Protocol: This Judgment was handed down remotely by circulation to the parties’ representatives by email and release to Bailii. The date for hand-down is deemed to be 19 July 2021.
Agreed common general knowledge
TV systems at the priority date
Over the air ("OTA") TV systems
Components and infrastructure used in networks and TV systems
The TV, the STB and the remote control
Access to the Internet via Telco networks
Upstream data via cable networks
Communication and data transfer
"Layers" in networks and software
EPGs for Video on Demand “VoD”
Automatic speech recognition (“ASR”)
Components and infrastructure used in ASR systems
Disputed common general knowledge
Issues of Claim interpretation
Mr Justice Meade:
1. In this action the Claimant (“Promptu”) alleges that the Defendants (together, “Sky”) have infringed European Patent (UK) No. 1,290,889 (“the Patent”), whose priority date is 8 June 2000 (“the Priority Date”). Sky denies infringement, alleges the Patent is invalid, and counterclaims for revocation. The alleged infringement relates to Sky’s Sky Q subscription television service.
2. As indicated, I am going to refer to all the Defendants as “Sky” since there is no reason to distinguish between them, but I should mention that the claim against the Fourth Defendant was dropped at the start of the trial.
3. The trial was conducted in court, with limited in-person attendance owing to the pandemic. All the oral evidence was live; a feed was allowed for persons approved by me who could not fit into the courtroom. I am grateful for the IT support provided as arranged by the parties.
4. Hugo Cuddigan QC appeared for Promptu with David Ivison and Lindsay Lane QC appeared for Sky with Maxwell Keay.
5. Promptu applied to amend the Patent. This went through a number of stages. Shortly before trial, Promptu said that it did not oppose a finding that all the claims of the Patent as proposed to be amended, apart from proposed amended claim 13, were invalid over the prior art.
6. Following this, Sky dropped two out of three of its prior art citations. The result of this was that the only prior art attack I had to decide was obviousness over the citation referred to as Houser (as defined below). The meaning and consequences of this concession by Promptu became a matter of debate, as I explain below.
7. The application to amend down to proposed amended claim 13 was unconditional.
8. The issues were:
i) Some modest issues over common general knowledge (“CGK”);
ii) Two issues of claim construction;
iii) Infringement. There was no issue of fact about how the Sky Q system worked, but there were issues about whether:
a) The operation of the system satisfied the requirements of the claim (which is a method claim) on its proper interpretation;
b) Even if it did, whether it was taken outside infringement by virtue of the fact that some of the elements of the claim take place outside the UK;
iv) Obviousness over United States Patent No. 5,774,859 (“Houser”);
v) Added matter. There were two points. One arose from granted claim 1 and the other from the proposed amendment to claim 1. So formally the second was an objection to amendment rather than an allegation that the Patent was invalid as it stood, but since the application to amend was unconditional it would have had the effect, if made out, that the Patent would be revoked. I have therefore dealt with both points together under the heading of validity, which is also convenient because much the same considerations apply to both.
9. The parties each called one expert to give oral evidence. No oral evidence was required on the PPD because it was not disputed and other witness statements about the Fourth Defendant became irrelevant when the claim against it was dropped.
10. Promptu’s expert was Dr David Greaves, a Senior Lecturer in Computing Science at Cambridge University, who had practical experience in various ways, including in particular as network architect for the Cambridge Interactive Television Trial (“CITV”) in 1993/4.
11. Sky’s expert was Dr David Robinson who has a PhD from Imperial College and extensive practical experience.
12. Each side criticised the other’s expert, not in terms of their independence or honesty, but in terms of their experience and how they corresponded to the notional skilled person. So it was said of Dr Greaves that he had too little experience and that it was from some time before the priority date, and of Dr Robinson that he had too much experience, and was inventive.
13. I did not find these criticisms helpful and they missed the mark. The relevant question is not how closely an expert corresponds in fact to the notional skilled addressee but whether they are able to put themselves in the position of the notional skilled addressee and assist the Court accordingly. I thought both experts were well able to do that. In any event, I thought that Dr Greaves’ work on CITV was very pertinent, with his more general experience being relevant and continuing up to and after the priority date, while Dr Robinson, although no doubt personally inventive, was able to put himself in the position of an uninventive and ordinary worker. It was suggested that he lost sight of this when, for example, he agreed that the notional skilled addressee was a “decent” engineer who could solve problems. Neither of these is inconsistent with a lack of inventive capacity. Notional skilled addressees can solve at least routine problems, but not where it involves invention. They are also “decent” at their jobs, in the sense of diligent and careful.
14. Promptu also criticised the way in which Dr Robinson was instructed. It alleged that the necessary “sequential unmasking” was not done, and that Dr Robinson was directed by Sky’s solicitors to the key part of Houser.
15. There was some modest force in this in relation to proposed amended claim 1, where one of the steps from Houser to the alleged invention involved focus on a particular part of Houser. However, as to this:
i) It is sometimes necessary for purely practical reasons to ask a witness to look at a particular part of a prior art citation, otherwise if they are asked to give all their thoughts about many different passages the task is too big and too diffuse. If it presents a risk of hindsight the Court may have to take it into account, but it is not necessarily fatal or even serious, especially if, as in this case, the witness acknowledges the pointer and gives evidence about why that part of the prior art would be of interest (which as it turned out was common ground at trial). A similar issue arises when experts are given, as they often have to be, guidance about what aspects of CGK to explain.
ii) Once it was conceded that proposed amended claim 1 was obvious, the point lost any real force. I return to this below.
16. Once the first round of expert evidence was in, and the experts knew the claims in issue, sequential unmasking ceased to be relevant. However, Promptu nonetheless submitted that Dr Robinson’s second and third reports exhibited hindsight because he was then looking, with knowledge of the proposed amended claims, for arguments in relation to minimising latency (relevant to proposed amended claim 13). I did feel that Dr Robinson overreached in his third report, in particular in relation to a document from DAVIC, and I have taken this into account, but it was a minor factor given the extent of agreement over CGK, and allowance must be made for the facts that sequential unmasking becomes impossible after a point, and that where a patentee maintains (too) many dependent claims, it is inevitable that not all can be covered in the fullest depth in a first report.
17. There was extensive agreement about the CGK, reflected in a document which, following my request at the PTR, was prepared by the parties. I have edited it down somewhat; some of it is of only modest direct relevance to the issues following Promptu’s abandonment of all claims except proposed amended claim 13, but I have left it in because it is useful to understand some of the written and oral evidence.
18. At the Priority Date, TV companies provided TV services by terrestrial broadcast, cable and satellite. Terrestrial transmission of television was the predominant form of television network in the UK.
19. A traditional TV broadcast system was the OTA, or terrestrial, TV system, where the transmissions of the TV signal were always broadcast OTA. Terrestrial television involved a one-way wireless broadcast of TV signals from a transmitter to the end user’s TV. The operator transmitted a number of different modulated signals OTA on a number of different radio frequencies ("RF").
20. A user's TV was equipped with an aerial or antenna, a tuner and a demodulator. The antenna received the transmitted modulated signals. The tuner was used to select (tune to) a specific RF channel. The information on that RF channel, for example a TV signal, was then demodulated and displayed on the TV.
21. The RF spectrum available for OTA limited the available channels to the consumer and hence limited the number of broadcast channels available.
22. Television delivery systems were largely analogue at the Priority Date. Digital television had been introduced in the years running up to the Priority Date and was expected to take over from analogue television entirely in time.
23. Existing analogue televisions required an add-on digital receiver which converted the received signal back into an analogue signal. The new digital system was designed to accommodate an increased number of channels.
24. In the UK, the Digital Terrestrial Television (DTT) system offered channels from a number of major companies including the BBC and Sky. DTT is a broadcast system, with the previously analogue content being transmitted over the air in digital form and with further digital forward channels replacing data previously provided in the vertical blanking interval. It included a conditional access system allowing viewing of subscription channels.
25. In satellite TV systems, the TV signal was transmitted from a "studio", to an uplink facility, from where the signal was relayed via a communications satellite in the sky to a satellite dish at a user's home. The signal was then transferred to a home's STB, to tune, demodulate, decrypt (if necessary), and decompress the incoming signal.
26. The satellite TV service itself was broadcast only. At the Priority Date, in order to enable interactive services, the STB would connect to the operator's headend by dialling up over the Public Switched Telephone Network ("PSTN") when required in order to get Pay Per View ("PPV") services, for example.
27. Traditional cable television (coaxial cable to the home): A coaxial cable is an electrical cable, consisting of an inner core wire and a surrounding shield. TV signals were transmitted over it on a wide RF spectrum (typically from around 50MHz up to approximately 800MHz). The RF spectrum was divided into smaller frequency bands, one per individual TV channel. The transmission of TV signals on coaxial cable used the same concept of modulation as OTA transmissions. However, rather than modulating waveforms for transmission over RF in the air, early cable systems modulated the OTA signals on to the cable TV system. Users could tune between TV channels delivered on a coaxial cable, just as if those channels were received on a local antenna - although a STB was additionally required for this function, since the received signal would need to be re-modulated before being sent to the TV. The TV would then tune to and demodulate the signal. The bandwidth allocated for each channel was 6MHz in the US and 8MHz in the UK.
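By way of illustration only, the channel arithmetic described above can be sketched as follows. The function name `rf_channel_slots` is hypothetical, and the band edges of 50MHz and 800MHz are the approximate figures given in the paragraph rather than exact values. The sketch counts whole RF slots only; it ignores guard bands and makes no assumption about how many programmes a single slot might carry.

```python
def rf_channel_slots(low_mhz: float, high_mhz: float, channel_mhz: float) -> int:
    """Count the whole channel-width slots that fit between two band edges."""
    if channel_mhz <= 0 or high_mhz <= low_mhz:
        raise ValueError("need a positive channel width and high_mhz > low_mhz")
    return int((high_mhz - low_mhz) // channel_mhz)


# Approximate downstream band from the text: around 50MHz up to about 800MHz.
uk_slots = rf_channel_slots(50, 800, 8)  # 8MHz per channel (UK)
us_slots = rf_channel_slots(50, 800, 6)  # 6MHz per channel (US)
```

The narrower US channel width yields more slots in the same band, which is the only point the sketch is intended to make.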
28. Coaxial cable was capable of broadcasting many hundreds of TV channels, far more than could be done using terrestrial broadcasting. This enabled a wider range of programming to be offered on cable TV.
29. A small amount of the bandwidth (the lower part of the RF spectrum, at approximately 5-40MHz) on a coaxial cable system was reserved for upstream communications from the user to the provider, which enabled some interactivity. However, for this the network additionally needed bi-directional amplifiers in the local infrastructure and a modem to generate the upstream signal.
30. The RF spectrum below 50MHz was also used for voice telephony over cable. The exact frequencies used changed over time and depended on the provider.
31. Hybrid Fibre Coax (HFC): By the Priority Date many coaxial cable systems were upgraded by replacing the coaxial cable in the infrastructure in the core of the network with optical fibre cables. The coaxial cable was retained for the network at the "edge", i.e. close to the home. This was known as HFC.
32. Optical fibre cables have the advantage of lower signal losses and greater capacity compared to coaxial cable meaning cost could be saved in the network by reducing the number of components such as amplifiers.
33. The TV was a universally available medium to display the TV content and any interactive services to the user.
34. Televisions could connect to a STB configured to provide a communications path between the STB and the headend, for system control and management data flows (such as PPV). On a cable TV network, as explained, this two-way communication was achieved via a modem which may have been embedded in the STB or may have been a discrete unit. Initially the modems were dial-up modems and used telephony. Other modems simply tuned to particular frequencies for upstream and downstream data traffic, with the frequencies being shared between multiple users. By the Priority Date cable modems were also in use. They provided faster data rates compared with first-generation STB upstream modulators and provided good access to the public Internet.
35. The STB would be controlled using a remote control. The remote control would be used to navigate the TV and interactive services. Some TVs would have displayed these functions using EPGs (also referred to as IPGs), which are explained further below.
36. The costs of the STB and its remote control were a major consideration for TV deployments. Even a small increase in the costs of these components could have a large economic effect. At the Priority Date, there was a debate in the industry as to where functionality should be put, either in the user's home (in the STB/remote control) or at the headend. Which location was chosen would depend on numerous factors such as, for example, the functionality in question, costs and ease of implementation (including control and updates). This was an engineering and financial trade-off.
37. The Cable TV system headend was run by the operator, which collected the TV content from content providers (via e.g. OTA, satellite or cable transmission) and processed and distributed the TV content to the users. The headend was responsible for e.g. converting content from analogue to digital format, applying content protection (for high value channels and PPV titles for example) and inserting adverts into the TV broadcast.
38. Co-axial cable TV network infrastructure: the coaxial cable TV network was a hierarchical network, where the headend transmitted the TV signals modulated onto RF, over coaxial cables to the customers' houses. The transmission was in Frequency Division Multiplexing form. The cable TV network included amplifiers to compensate for signal loss over the significant distance the signal travelled, and splitters and taps to enable the signal to reach each user. The initial coaxial cable from the headend was known as the 'super-trunk'. The super-trunk branched out as the coaxial cable progressed through the network, with trunk splitters (which divided the input TV signal) positioned along the coaxial cable path, dividing the residential area (or neighbourhoods) into segments.
39. The final connection to a user's STB was made via drop cable at a tap, which was a cable of up to 30 metres that led off the main coaxial cable, to the user's home.
40. In order to use the upstream capacity, the amplifiers had to be upgraded to bi-directional amplifiers. The coaxial cable was a shared physical medium which meant that the upstream bandwidth capacity on a coaxial cable TV network would usually be shared between a large number of customers, the actual number depending on the size and use of the network.
41. HFC network infrastructure: The introduction of optical fibre increased capacity and, by reducing the number of amplifiers and other components, had the added benefit of decreasing noise and distortion in the system. In HFC the cable TV network was split into sub areas, each served by a coaxial cable network. Thus the infrastructure providing the final connection to the residential customer was not changed.
42. The "optical to electrical distribution node" was the point where the optical signal was converted back to an electrical signal and sent over the coaxial cable that branched into the residential customer serving areas also referred to as the distribution network or "last mile".
43. The existing coaxial cable connection from the headend to the customer was upgraded so that the links from the headend to the nodes were replaced by two-way optical fibres; traditional coaxial cable remained in place between the node and the customer's home.
44. Cable networks were typically referred to as tree and branch structures. Depending on the size of the cable network, a cable network may consist of multiple headends. This would not change the basic tree and branch structure though.
45. The figure below shows how the RF spectrum was typically allocated in the mid-90s (the exact frequency ranges would vary from operator to operator). The lower frequency (here 5 to 40MHz) would be reserved for the upstream transmissions, with the higher frequencies being reserved for the analogue and digital channels.
46. The figure below shows how the RF spectrum was allocated with the use of DOCSIS at the Priority Date, which used the 5 to 42 MHz frequency spectrum for the upstream and selected channels within the higher frequency spectrum for the downstream transmission.
47. Technologies using both frequency allocations would have been in use at the Priority Date.
48. Telco network infrastructure: In the Telco network infrastructure, a direct line led from the user's home to the local telephone exchange. This line may have stretched up to three miles from the user's home. The ADSL capacity between the exchange and the user's home varied with distance (the further the user was from the local telephone exchange, the lower the bandwidth). The bandwidth was also dependent on the quality of the line from the home to the exchange. At the local telephone exchange the data was extracted and combined with multiple lines from multiple users. The aggregate data was then sent over the operator's internal network, often consisting of an optical fibre network, using various Telco transmission protocols.
49. A home may have alternatively been connected to a node located between the central office / local exchange and the home. The closer the home was to the exchange or node, the higher the bandwidth capacity that was available to the user.
50. Dial-up modems: By the Priority Date, a large proportion of households had dial-up internet access. Dial-up modems used the existing analogue voice service to send data. The customer’s modem made a telephone call to the phone number of the internet service provider (ISP). The ISP's phone lines were, in turn, connected to modems which would answer the incoming calls. Each modem transmitted a modulated electrical signal representing the data it received from the local computer it was connected to, while receiving the equivalent signal from the remote modem and converting it back into a data signal which it sent to the local computer. The data was modulated onto the voice channel of the PSTN. It allowed access to the public Internet. The bandwidth available on a standard telephone line (designed just for voice) was limited and the data rates achieved therefore low (for example, by 2000 a good line could be expected to achieve 56 kbps).
51. Integrated Services Digital Network: "ISDN" split the frequency used for standard telephone calls into multiple digital channels. These channels operated concurrently and independently, allowing multiple simultaneous conversations or data interactions over the same physical line (i.e. one could make phone calls, access the Internet for web browsing, and transmit data, all at the same time, on the twisted copper pair line).
52. ISDN was a symmetric method (same capacity upstream as downstream). At the Priority Date some operators (e.g. BT Home Highway) were offering a data rate of 128kbps to residential customers. The ISDN Internet service was accessed by connecting an ISDN modem to the twisted copper pair.
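To give a sense of scale for the data rates mentioned above (56 kbps for a good dial-up line and 128kbps for ISDN), the following is a minimal sketch of the transfer-time arithmetic. The function name `transfer_seconds` and the 1,000,000-byte file size are assumptions chosen for illustration; "kbps" is taken to mean 1,000 bits per second, and protocol overhead is ignored.

```python
def transfer_seconds(size_bytes: int, rate_kbps: float) -> float:
    """Idealised time to move size_bytes over a link of rate_kbps (1 kbps = 1000 bit/s)."""
    if rate_kbps <= 0:
        raise ValueError("rate must be positive")
    return size_bytes * 8 / (rate_kbps * 1000)


FILE_BYTES = 1_000_000  # hypothetical file size, for illustration only

dialup_s = transfer_seconds(FILE_BYTES, 56)   # good dial-up line
isdn_s = transfer_seconds(FILE_BYTES, 128)    # ISDN rate quoted in the text
```

On these assumptions the same file takes a little over two minutes on dial-up and about one minute on ISDN.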
53. ADSL: ADSL technology offered consumers faster Internet access over the telephone network than dial-up modems or ISDN. In ADSL, more bandwidth was made available on the Telco network to allow access to upstream and downstream data transfer.
54. In the home, PCs and STBs connected to the ADSL router using an Ethernet connection. A telephone could also be connected to the ADSL router using a frequency splitter to remove the higher frequencies used by ADSL, to prevent these from interfering with the telephone calls.
55. ADSL enabled digital data to be sent over Telco networks at the same time as the lines were used for telephone calls. It was a two-way "full duplex" communication line.
56. Dial-up and single frequency modems: Dial-up modems were also used in STBs. These could use the PSTN for access or the telephony part of the cable network (where provided).
57. Other modems in cable STBs would typically use a single frequency allowing access to the Internet as well as cable TV. For example, the Motorola DCT-2000 STB had an upstream modem option (STARVUE II (RF)). STARVUE would tune into a particular frequency in the upstream part of the cable spectrum. It is likely that many other STBs in the same cable segment would be using the same upstream frequency. When the data rates were low (e.g. an occasional request for a PPV purchase) a simple contention-based protocol such as Aloha was sufficient. As higher and more sustained data rates became necessary (e.g. for access to the Internet), more efficient use of the upstream spectrum was required.
58. Cable modems: Proprietary cable modems had existed for some time prior to the Priority Date; however, the technological approach to this was standardised under the Data Over Cable Service Interface Specification ("DOCSIS") technical standards. This enabled standards-based interoperability whereby "certified" cable modems from multiple vendors work with "qualified" Cable Modem Termination Systems ("CMTSs") from multiple vendors. The DOCSIS standard, which originated in the US, was amended for Europe to take account of the regional variation in radio spectrum for a TV channel (8 MHz for Europe, 6 MHz for the US).
59. DOCSIS enabled high-bandwidth data transfer over coaxial cable systems and enabled two-way "full-duplex" digital communications over a cable modem (with regional variations). DOCSIS handled data traffic as IP packets using the IP Protocol.
60. Internet access through cable modems was faster than through dial-up modems and like ISDN had the additional advantage of not tying up the consumer's telephone line.
61. At the Priority Date, data could be sent in either analogue or digital form. An analogue signal is a signal that represents a physical quantity which has a continuously varying value. This could be sound waves of a spoken voice, the electrical voltage or physical movement of the needle on a vinyl record.
62. A digital signal, on the other hand, was a sequence of discrete finite values - e.g. a string of '0's and '1's. An analogue signal could be converted into a digital signal by periodically sampling the analogue signal and giving a digital representation of each sample.
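The conversion described above, periodic sampling followed by a digital representation of each sample, can be sketched as follows. This is an illustration under stated assumptions and not a description of any particular converter: the function name `sample_and_quantise`, the 5Hz test tone, the 100Hz sample rate and the 3-bit resolution are all hypothetical choices.

```python
import math


def sample_and_quantise(signal, duration_s, sample_rate_hz, bits):
    """Sample signal(t) (expected range -1..1) at regular intervals and
    map each sample to an integer code in the range 0 .. 2**bits - 1."""
    levels = 2 ** bits
    codes = []
    for n in range(int(duration_s * sample_rate_hz)):
        t = n / sample_rate_hz
        value = max(-1.0, min(1.0, signal(t)))            # clip to the expected range
        code = min(levels - 1, int((value + 1.0) / 2.0 * levels))
        codes.append(code)
    return codes


def tone(t):
    """A continuously varying (analogue) test signal: a 5Hz sine wave."""
    return math.sin(2 * math.pi * 5 * t)


codes = sample_and_quantise(tone, duration_s=1.0, sample_rate_hz=100, bits=3)
bitstream = "".join(format(c, "03b") for c in codes)      # a string of '0's and '1's
```

Each sample becomes one of eight discrete values, and the sequence of codes is then written out as a string of '0's and '1's.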
63. Digital signals were less susceptible to noise (at any moment a signal is either a 0 or 1) and enabled use of error correction and retransmission. Digital signals could be compressed to reduce the capacity they took up on a transmission medium, and the resulting digital stream used to modulate carrier signals for transmission (e.g., on a wire or over the air).
64. Modulation is the process in which an input signal, such as that representing the TV content, is modulated onto a sinusoidal carrier signal, resulting in the modulated signal.
65. From the early days of radio, different radio channels would be modulated onto distinct carrier frequencies. The receiver could tune to the desired carrier signal, demodulate it, and recover the original program without interference from other streams. Hence, multiple concurrent radio channels would not interfere with each other provided the receiver was tuned to select only the desired carrier. This was how radio and later analogue TV worked allowing many radio and TV stations to operate in the same geography.
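The principle that a receiver tuned to one carrier recovers its signal without interference from another can be illustrated with a deliberately simplified sketch. It is an assumption of the sketch that each "programme" is a single constant level (a real TV or radio signal varies over time), and the carrier frequencies, sample rate and function names are hypothetical. Demodulation is done by multiplying the received signal by the chosen carrier and averaging, which is one simple technique among several.

```python
import math

SAMPLE_RATE = 1000   # samples per second (assumed)
N_SAMPLES = 1000     # one second of signal


def carrier(freq_hz, n):
    """Value of a sinusoidal carrier of freq_hz at sample index n."""
    return math.cos(2 * math.pi * freq_hz * n / SAMPLE_RATE)


def transmit(levels_by_freq):
    """Modulate each level onto its own carrier and add them on one shared medium."""
    return [
        sum(level * carrier(freq, n) for freq, level in levels_by_freq.items())
        for n in range(N_SAMPLES)
    ]


def tune_and_demodulate(received, freq_hz):
    """Select one carrier: mix with that carrier and average over the window."""
    mixed = [2 * x * carrier(freq_hz, n) for n, x in enumerate(received)]
    return sum(mixed) / len(mixed)


shared = transmit({50: 0.3, 80: 0.7})          # two channels on distinct carriers
recovered_50 = tune_and_demodulate(shared, 50)
recovered_80 = tune_and_demodulate(shared, 80)
```

Tuning to the 50Hz carrier returns the level sent on that carrier, and likewise for 80Hz, even though both were transmitted together.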
66. By the Priority Date the transmission of data in telecoms systems had primarily moved from circuit-switched to packet-switched communications. In the early 1990s, well-established non-broadcast services such as telephone calls, would be circuit switched - once set up, a call had a dedicated digital or analogue bandwidth available until the call was cleared down. A dedicated connection was established between A and B and the data travelled on that connection. In packet-based transmission, networks moved data in separate, small blocks -- packets -- based on the destination address in each packet. When received, packets were reassembled in the proper sequence to make up the message. This allowed greater efficiencies to be obtained when conveying traffic that does not have a constant data rate. No single resource needs to be maintained until the end of the communication. Instead bandwidth is dynamically shared between several users. The packet generally contained a header (including address information) and a payload (the actual data to be sent).
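The packet-based approach described above can be sketched in a few lines. The sketch is illustrative only: the `Packet` fields, the use of a sequence number to restore order, the 8-byte payload size and the destination name `host-b` are assumptions and are not taken from any particular protocol.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    destination: str   # header: address information
    sequence: int      # header: position in the original message (assumed field)
    payload: bytes     # the actual data to be sent


def packetise(message: bytes, destination: str, payload_size: int) -> list:
    """Split a message into small blocks, each carrying its own header."""
    return [
        Packet(destination, index, message[offset:offset + payload_size])
        for index, offset in enumerate(range(0, len(message), payload_size))
    ]


def reassemble(packets) -> bytes:
    """Put received packets back into the proper sequence."""
    return b"".join(p.payload for p in sorted(packets, key=lambda p: p.sequence))


message = b"packets may arrive out of order"
packets = packetise(message, "host-b", payload_size=8)
random.Random(0).shuffle(packets)      # simulate arrival in a different order
restored = reassemble(packets)
```

Because each packet carries its own header, the blocks can travel independently and still be reassembled into the original message at the destination.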
67. The OSI 7 layer stack was a well-known reference model, which helped explain at a high level how devices in a network communicated. The functionality of the network devices was essentially divided up into a vertical stack that consisted of seven layers, whereby each layer performs certain tasks and then passes data on to the next layer. The point was to divide the flow of data through the network into independent but interoperable layers. Each layer served some functionality to the layer above it and was served by the layer under it. This meant that corresponding layers in different places could communicate with each other without having to take account of the complexities of higher and lower layers.
68. An informal representation of it is shown below:
69. The physical layer (Layer 1) was responsible for the physical characteristics of a single link (hop). This covered how data was modulated onto the physical medium as well as the dimensions of the connectors etc.
70. The Datalink layer (Layer 2) was responsible for transmitting chunks of data across a link. This included coordinating access to a shared medium as well as providing addresses and providing basic error detection and correction.
71. The network layer (Layer 3) was responsible for the transmission of packetized data from one host to another over multiple hops. Each packet was identified by a sender and destination address to facilitate the routing through the network. An example of applying sender and destination addresses was the IP Protocol. However, other protocols would have been available for this purpose. The Skilled Person would have known that in order to send information between two devices, which are located remote from each other, such as in a client/server system, appropriate communication protocols to apply source and destination addresses needed to be deployed. Additional forms of identifying the source could be included at the application layer (Layer 7).
72. The transport layer (Layer 4) was responsible for the end-to-end delivery of complete messages or segments. It dealt with errors that the network layer could introduce, such as lost packets, reordered packets, duplicated packets as well as fragmentation and reassembly. A transport layer could provide:
i) a connection oriented service - an example is the TCP Protocol;
ii) a connectionless service - an example is the UDP Protocol.
73. The Upper Layers (session, presentation and application), were often implemented in software and ran in the user spaces of the operating system.
74. Structuring communications in this manner, using industry-standard interfaces between the layers, meant that the upper layers did not have to consider the means of physical transmission. Therefore, applications that worked in one network environment would automatically work in another (they can be "network agnostic") and lower levels did not have to be customised to particular applications that made use of their capabilities.
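The layering idea can be illustrated with a toy sketch in which each layer adds its own header on the way down and its peer removes it on the way up. Only three layers are modelled, and the bracketed text headers and the function names `send_down` and `pass_up` are hypothetical; real protocol headers are binary structures with many fields.

```python
LAYERS = ["transport", "network", "datalink"]   # ordered from upper to lower


def send_down(app_data: bytes) -> bytes:
    """Sender side: each layer wraps what it receives from the layer above."""
    frame = app_data
    for layer in LAYERS:
        frame = f"[{layer}]".encode() + frame
    return frame


def pass_up(frame: bytes) -> bytes:
    """Receiver side: each layer strips only its own header, lowest layer first."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} header")
        frame = frame[len(header):]
    return frame


on_the_wire = send_down(b"hello")
delivered = pass_up(on_the_wire)
```

The application data is unchanged end to end, and the application code never needs to know which headers were added beneath it.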
75. In general terms, multiplexing was a process of combining multiple signals which were being sent between two or more devices. Multiplexing could happen in various layers including but not limited to the physical, datalink, network and application layers.
76. At the recipient end, the different communication flows were demultiplexed using the source and destination addresses for packet data.
77. At the Priority Date, the Skilled Person would have been aware of different forms of multiplexing on the physical layer:
i) Frequency Division Multiplexing ("FDM"): The overall bandwidth in a given system was divided into a series of non-overlapping frequency bands. Each non-overlapping band carried a separate signal. FDM was used for example in radio and TV broadcasting and in cable TV to transmit the TV channels;
ii) Time Division Multiplexing ("TDM"): In TDM, data, such as in the form of packets, was assigned fixed time slots on a given transport link, e.g. a frequency channel. That way more than one user could use the same frequency channel and each user was provided with a different time slot to transmit its data. This avoided "contention" on the channel so data sent from more than one user does not compete for the same resource;
iii) Wavelength Division Multiplexing ("WDM"): WDM is essentially the same process as FDM as applied to optical fibre cables. Multiple optical carrier signals are multiplexed onto a single optical fibre using different wavelengths.
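By way of illustration only, the fixed-slot principle of TDM described above can be sketched in a few lines of Python. The function names `tdm_multiplex` and `tdm_demultiplex`, and the sample data, are assumptions made for this sketch and do not come from the evidence; a real system would also deal with framing and synchronisation.

```python
from itertools import zip_longest


def tdm_multiplex(streams):
    """Interleave per-user data units into fixed, repeating time slots.

    Slot i of every frame belongs to user i. A user with nothing to send
    leaves its slot empty (None), so users never contend for a slot.
    """
    frames = zip_longest(*streams, fillvalue=None)
    return [unit for frame in frames for unit in frame]


def tdm_demultiplex(link, n_users):
    """Recover each user's data purely from the slot position on the link."""
    return [
        [unit for unit in link[i::n_users] if unit is not None]
        for i in range(n_users)
    ]


# Hypothetical data units for three users sharing one link.
streams = [["a1", "a2", "a3"], ["b1"], ["c1", "c2"]]
link = tdm_multiplex(streams)
recovered = tdm_demultiplex(link, len(streams))
```

The empty slots left by idle users show the cost of a fixed allocation: capacity is reserved whether or not it is used.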
78. The Aloha transmission protocol (operating over the physical/data-link layers) was a multiple access protocol for transmission of data over a shared network channel such as a cable uplink. It was a so-called "contention" based protocol in which the sending device did not check whether the channel was idle or busy before starting to transmit.
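The consequence of transmitting without first checking the channel can be shown with a short sketch, given here for illustration only. It assumes the simplest ("pure") form of Aloha with frames of equal duration, in which two frames collide if their start times differ by less than one frame time. The function name `aloha_collisions` and the device names are hypothetical.

```python
def aloha_collisions(transmissions, frame_time):
    """Flag which frames collide on a shared channel under pure Aloha.

    transmissions: list of (sender, start_time) pairs.
    Senders transmit whenever they have data, without sensing the channel,
    so any two frames whose start times are closer than frame_time overlap.
    """
    results = []
    for i, (sender, start) in enumerate(transmissions):
        collided = any(
            j != i and abs(start - other_start) < frame_time
            for j, (_, other_start) in enumerate(transmissions)
        )
        results.append((sender, start, collided))
    return results


# Hypothetical uplink traffic from three devices; frame time is 1.0 unit.
outcomes = aloha_collisions(
    [("stb_a", 0.0), ("stb_b", 0.5), ("stb_c", 3.0)], frame_time=1.0
)
```

In this sketch the first two frames overlap and would both have to be sent again, whereas the third is received intact.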
79. TCP/IP was a set of standards and procedures specifying how packetized data was transmitted between network devices, detailing how the communication was broken into variable sized packets, addressed, transmitted, routed and received at the destination. These standards were central to supporting communications across the Internet.
80. IP (Internet Protocol) was a network protocol where data was sent as individual packets (typically up to 1500 bytes). Each packet had a source and destination address which network equipment used to route the packet from source to destination.
81. The TCP/IP conceptual stack consists of 4 standard layers but these are readily mapped to the 7-layer reference model as shown in the diagram below:
82. TCP (Transmission Control Protocol) was a connection-oriented transport layer protocol which usually ran over IP. It provided a reliable, flow controlled, and ordered stream of bytes between source and destination. The stream would typically consist of multiple IP packets and may be long lasting. Prior to sending any data, a connection was first established between the source and destination.
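The way in which an ordered stream of bytes can be rebuilt from packets that arrive out of order or more than once can be sketched as follows. This is a simplified illustration and not an implementation of TCP: it assumes that each segment carries the byte offset of its first byte (counting from 0), that segments do not partially overlap, and it omits acknowledgements, retransmission and flow control. The name `reassemble` is hypothetical.

```python
def reassemble(segments, initial_seq=0):
    """Rebuild an in-order byte stream from (sequence_number, data) segments.

    Duplicates are ignored, out-of-order segments are buffered, and delivery
    stops at the first gap (the receiver would wait for a retransmission).
    Returns the delivered bytes and the next sequence number expected.
    """
    buffered = {}
    for seq, data in segments:
        if data:                      # ignore empty segments
            buffered.setdefault(seq, data)  # keep the first copy only

    stream = bytearray()
    next_seq = initial_seq
    while next_seq in buffered:
        chunk = buffered[next_seq]
        stream += chunk
        next_seq += len(chunk)
    return bytes(stream), next_seq


# Segments arrive reordered, one is duplicated, and bytes 10-14 are missing.
delivered, expected = reassemble(
    [(5, b"world"), (0, b"hello"), (0, b"hello"), (15, b"!")]
)
```

Here `delivered` holds only the contiguous bytes received so far; the segment at offset 15 is held back until the gap before it is filled.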
83. TCP/IP was well established at the Priority Date to send data between users and providers in a network.
84. UDP (User Datagram Protocol) was a connectionless transport layer protocol using IP. In UDP data packets were addressed and routed individually without any flow control or error correction (only error detection), i.e. if packets were lost, they would not be re-sent.
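The difference between error detection and error correction can be illustrated with the 16-bit ones' complement checksum of the kind used by UDP (the algorithm is described in RFC 1071). The sketch below is a simplification: it applies the checksum to a payload only, whereas the real UDP checksum also covers a pseudo-header and the UDP header. The function name is an assumption made for this example.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"               # pad odd-length data with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


payload = bytes.fromhex("0001f203f4f5f6f7")   # worked example from RFC 1071
checksum = internet_checksum(payload)

# The receiver recomputes over payload plus checksum: zero means "no error seen".
intact = internet_checksum(payload + checksum.to_bytes(2, "big")) == 0
corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]
detected = internet_checksum(corrupted + checksum.to_bytes(2, "big")) != 0
```

A failed check tells the receiver only that the packet is damaged. Nothing in the checksum allows the data to be repaired, and UDP itself does not ask for the packet to be sent again.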
85. TCP/IP was most commonly used in modern computer networks, in particular those making up the Internet.
86. Initially cable TV STBs could only receive information broadcast from the headend such as TV programs or listing data about these programs. This was sufficient for broadcast TV. When the number of TV channels started to grow, viewers needed a way of navigating what was available. This led to the development of EPGs.
87. By the Priority Date, EPGs allowed a user to navigate through available options using a remote control. In certain programming guides the viewer could also see previews or view additional information about programs that were airing or scheduled to air.
88. VoD could involve access to a large catalogue of videos. By the Priority Date, there were video servers capable of storing numerous videos, but this was of little value if the end user was not able conveniently to navigate the titles and quickly find a satisfactory one. A long text list of titles was not sufficient.
89. One way to aid the navigation was to use richer EPG metadata. Box art had long been a way to catch the eye. Even today it is common for VoD systems to display a matrix of box art with title. When the title was selected, additional details could be shown including a detailed description of the video, list of the cast, producer/year of production etc.
90. If VoD playback of content by the server could be controlled from the STB by the user, this allowed full VCR control including play, pause, fast forward, rewind and stop. In addition to having the relevant buttons on the remote control, an on-screen menu could pop-up with the familiar icons and the user could use left/right/select to control the playout.
91. At the Priority Date, ASR was a developing and active field of research and product development and had been for most of the preceding decade.
92. As well as "off-the-shelf" solutions for certain product types, there were a number of well-known technology solution providers. The leading groups on the industry-side included (among others) Philips, IBM, Microsoft, Dragon Systems, Lernout & Hauspie, AT&T and Panasonic.
93. At the Priority Date, the Skilled Person would have understood the basic architecture of an ASR system and that it may comprise a number of steps.
94. The ASR system could in principle be distributed in the overall system architecture, as long as the respective input and output information was transmitted between the modules and as long as the integrity of the content of the information (i.e. the data included in the information as well as the time-sequence of the data elements) was safeguarded.
95. When deploying ASR at the Priority Date, the components and infrastructure used would be determined by the application in which the ASR was being implemented and the performance required by that application. The range of factors to be considered when designing a system included the size of the vocabulary, whether it was necessary to update the vocabulary (and if so how, and how often), the speed of processing / latency (time between speech utterance and conversion to text) and accuracy of the recognition. If the latency was too high or accuracy too low, people would be deterred from using the application supported by the ASR system. Depending on the application the Skilled Person would also need to consider whether the system should be speaker-dependent (requiring training) or independent, and whether it would need to deal with continuous speech, or just short commands or phrases. For continuous speech more sophisticated language and grammar models would be necessary. The Skilled Person would also understand that they would need to have a certain amount of computing resources, i.e. processing power and memory, in order for the system to function effectively.
96. The Skilled Person would also have needed to consider where to place the means to capture the user's spoken requests (i.e. the microphone). If possible, placing microphones close to users’ mouths would be beneficial for clearly receiving the spoken utterances, while reducing extraneous sounds such as background noise. There could have been a stand-alone microphone or a microphone built into another device, and these could be wired or wireless. The separation of intended utterances and background noise could have been further enhanced by using a microphone with cardioid or other selective reception pattern to reject undesired noises. It would also have been necessary to consider user acceptance in this context.
97. Other factors would have needed to have been considered when deciding where to locate, and how to operate, the microphone. The first is power supply and, in the case of a handheld device powered by batteries, battery life. If the microphone is live all the time, then the battery will be drained very quickly powering the circuitry for receiving and pre-processing the speech utterances. One obvious solution when including the microphone in a battery operated device would have been to include a push-to-talk button (such as commonly found in walkie-talkie mobile handsets or on digital dictation machines) where the microphone and related circuitry is only live when pressed. This also brings another obvious benefit that the system is only picking up utterances when the user actually intends it to, i.e. when the button is pressed, and so it does not pick up general conversation and the like. Push-to-talk was of course an old concept going back decades referring to things like two-way radios where you would press a button in order to talk and transmit voice, rather than being on receive mode (button not pressed). By the Priority Date it was a commonly used term to refer to any device where you would press a button in order to transmit or capture voice.
98. A known alternative to the microphone being close to users’ mouths would have been to put the microphone in the receiving station such as a PC or STB for example, or to have it as a standalone microphone in a fixed point in the room such as hanging from the ceiling. Being relatively distant from the speaker giving utterances, the microphone would have to be less directional and would be prone to picking up unwanted sounds and noise. This makes the task of good speech recognition harder as the signal to noise ratio will be worse. Given the issue of other noise/sound pick-up from a microphone in the device across the room, the user experience can suffer - the user would have needed to make sure they were close enough to the device and that no one else was talking. You would again have had the issue of trying to only pick up and process speech intended for the device and so there would have had to be some kind of activation key word such as 'ATTENTION' for example that could have been used to activate the microphone to pick up the voice for speech recognition. The microphone would always need to be powered up, of course, so that it can listen out for the activation word.
99. It was well known that latency was an issue to be avoided particularly for user acceptance and so steps were routinely taken to improve this.
100. There were no head-on conflicts about CGK following the oral evidence, but there was one area where there was perhaps a difference of emphasis, or as to the detail that was CGK, and that was in relation to latency in ASR.
101. As is set out above, it was CGK that latency could be a problem. In cross-examination, Dr Greaves was taken to a 1998 DAVIC specification which contained a latency “budget” for a movies-on-demand application running on an HFC network. It provides for a time of 400ms from starting a selected movie from a stopped situation. The budget is split so as to include separate upstream and downstream components.
102. It was not contended that the specific budgeted times set out were CGK and it is clear that they were just illustrative. But Dr Greaves accepted, and I find, that the sorts of matters for which a budget had to cater, and the general level of times that were covered (down to about 10ms) were CGK, and that it was CGK that operations of this general kind had to be perceived by the user as being implemented more or less instantaneously. It was also CGK that setting up an upstream connection would be recognised as a task whose contribution to latency had to be taken into account.
103. It is relevant to the infringement issues to know what a WebSocket is. This was not addressed as part of the CGK, I assume because it came later than the priority date. The following explanation is taken from the evidence of Dr Greaves and I believe is not contentious (it involves a fuller description of TCP than given in the main CGK section above):
“314. …. WebSockets use TCP as the underlying transport mechanism, and WebSockets are HTTP-compatible in the sense that control messaging to establish a WebSocket uses HTTP. …. . The WebSocket provides a standardised way for the server to send content to the client without being first requested by the client for each interaction, and allowing messages to be passed back and forth while keeping the connection open. In this way, a two-way ongoing conversation can take place between the client and the server. The communications are usually done over TCP port number 443 (or 80 in the case of unsecured connections).
…
317. A TCP connection is a logical channel made up of a stream of bytes that are sent between the two processes in either direction. The data is sent between the processes as TCP packets. Each TCP packet contains 4 separate fields that together uniquely identify the TCP/IP connection to which the packet is associated. These identifiers are: source IP address, source port number, destination address and destination port number. Port numbers on a server computer act like flat numbers on an apartment block, providing a means for routing incoming material. For example, the “telnet” application usually listens to port 23, and any incoming connection into port 23 will be directed to the telnet application. In the case of the WebSocket connection, the server’s port number is usually 443 or 80.
318. The WebSocket protocol (which is managed by the application code) adds a framing structure on top of the streaming nature of TCP, and provides, when used with SSL/TLS, a more secure connection establishment mechanism, as well as constructing a logical channel (or channels) on top of the TCP connection.
319. Data communicated over the WebSocket is formatted using the WebSocket framing mechanism which further partitions the TCP connection so that it can convey a mixture of traffic. The mixed traffic can be using a variety of protocols, including TCP itself, leading to a situation where a TCP connection is running over another TCP connection.
320. WebSockets persist until they are shut down, and their function is to maintain a communication channel.”
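For illustration only, the framing structure that the WebSocket protocol places on top of the TCP byte stream can be sketched as follows. The layout follows the base framing of RFC 6455 (a FIN bit and opcode, a payload length, and an optional 4-byte masking key), but the sketch handles only single, unfragmented frames and does not perform the HTTP handshake. The function names `encode_frame` and `decode_frame` are hypothetical and are not taken from the evidence.

```python
import struct
from typing import Optional, Tuple


def encode_frame(payload: bytes, opcode: int = 0x2,
                 mask_key: Optional[bytes] = None) -> bytes:
    """Build one unfragmented WebSocket frame (FIN bit set)."""
    first = 0x80 | (opcode & 0x0F)
    mask_bit = 0x80 if mask_key else 0x00
    n = len(payload)
    if n < 126:
        header = bytes([first, mask_bit | n])
    elif n < (1 << 16):
        header = bytes([first, mask_bit | 126]) + struct.pack("!H", n)
    else:
        header = bytes([first, mask_bit | 127]) + struct.pack("!Q", n)
    if mask_key:
        # Client-to-server frames are XOR-masked with a 4-byte key.
        header += mask_key
        payload = bytes(b ^ mask_key[i % 4] for i, b in enumerate(payload))
    return header + payload


def decode_frame(buf: bytes) -> Tuple[int, bytes, bytes]:
    """Split one frame off the front of a byte stream.

    Returns (opcode, payload, remaining_bytes).
    """
    opcode = buf[0] & 0x0F
    masked = bool(buf[1] & 0x80)
    n = buf[1] & 0x7F
    pos = 2
    if n == 126:
        (n,) = struct.unpack_from("!H", buf, pos)
        pos += 2
    elif n == 127:
        (n,) = struct.unpack_from("!Q", buf, pos)
        pos += 8
    key = b""
    if masked:
        key = buf[pos:pos + 4]
        pos += 4
    payload = buf[pos:pos + n]
    if masked:
        payload = bytes(b ^ key[i % 4] for i, b in enumerate(payload))
    return opcode, payload, buf[pos + n:]


# Two messages sent back to back arrive as one undivided run of bytes;
# the frame headers are what allow the receiver to separate them again.
stream = encode_frame(b"Hello", opcode=0x1) + encode_frame(b"x" * 300)
op1, msg1, rest = decode_frame(stream)
op2, msg2, rest = decode_frame(rest)
```

The point of the sketch is that TCP itself delivers only a continuous stream of bytes; the message boundaries exist because of the length field in each frame header.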
104. In short, and at least in the context of the alleged infringement, WebSockets can be thought of as an enhancement on top of TCP, in which the packets sent contain IP address and port information for the source and destination.
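The role of the four identifying fields can be sketched as follows, purely by way of illustration. The sketch groups incoming packets by source address, source port, destination address and destination port, so that traffic from many clients arriving at the same server port is kept apart. The function name `demultiplex` is hypothetical, and the addresses are taken from ranges reserved for documentation.

```python
from collections import defaultdict


def demultiplex(packets):
    """Group packet payloads by their connection 4-tuple.

    packets: iterable of (src_ip, src_port, dst_ip, dst_port, data).
    Returns a dict mapping each 4-tuple to the bytes received on it.
    """
    connections = defaultdict(bytearray)
    for src_ip, src_port, dst_ip, dst_port, data in packets:
        connections[(src_ip, src_port, dst_ip, dst_port)] += data
    return {key: bytes(value) for key, value in connections.items()}


# Two clients, and two connections from one client, all to server port 443.
packets = [
    ("192.0.2.10", 50001, "198.51.100.1", 443, b"AA"),
    ("192.0.2.20", 50001, "198.51.100.1", 443, b"BB"),
    ("192.0.2.10", 50002, "198.51.100.1", 443, b"CC"),
    ("192.0.2.10", 50001, "198.51.100.1", 443, b"aa"),
]
by_connection = demultiplex(packets)
```

Although every packet is addressed to the same server port, three separate connections result, because the source address and source port differ.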
105. As I have said, the Patent has a priority date of 8 June 2000. It is entitled “System and Method of Voice Recognition Near a Wireline Node of Network Supporting Cable Television and/or Video Delivery”.
106. The specification of the Patent is long (492 paragraphs, 35 figures) and much of it is not relevant to what I have to decide. Further, most of the parts to which the parties went in evidence and argument are best introduced and explained in the context of the claim interpretation points (and/or in relation to the points on added matter, where equivalent parts of the application for the Patent as filed were material).
107. For present purposes, I can explain the general idea behind the claims in issue by reference to Figure 3, whose narration in the specification starts at [0055]. Figure 3 is as follows:
108. It should be noted that there are some typographical mismatches between the reference numerals in Figure 3, and the text of the specification. For example “tightly coupled server farm” is 3000 in the text but shown as 2000 in the Figure. The errors are obvious and neither side suggested that they hinder comprehension.
109. The key aspects of Figure 3 are that each user has a remote control and a set-top box. The set-top boxes of the many users in the system as a whole all communicate over a network, in particular in Figure 3 via HFC optical nodes, to a central server, referred to in the claims as a “wireline node”.
110. The individual users are able to give speech commands using microphones in their remote controls.
111. The users’ microphones each have a “push-to-talk” button and (at [0060]) it is taught that “The push-to-talk button may begin the process of speech recognition by informing the system that the subscriber is about to speak and also to provide immediate address information. Address information identifies the user site at which the speaking occurs.”
112. Further detail of this is given at [0089] - [0095] (by which stage Figures 4 and 5 have been introduced, although nothing turns on them) where the means of communication between the remote control and the set-top box are discussed, as well as some speech pre-processing in the remote control, and sending the address information ahead of the speech packets, among other things to improve the efficiency of the processing at the wireline node. It is also explained that immediately on pressing the push-to-talk button an icon may be displayed on the user’s TV to show that their input has been taken in and is being acted upon.
113. In those paragraphs, the push-to-talk button is referred to as the “PTT button”. In other contexts in the case the expressions “recognise button”, “press to speak button” or just “talk button” were used. They are interchangeable as used to denote a button pressed by the user when he or she intends to give a voice command, but that does not mean that the functions that occur (or would be obvious to make occur) when they are pressed are always necessarily the same, and I address that where necessary.
114. At the wireline node, speech processing is done centrally for all the users. This can include fetching and using speech parameters specific to individual users (see [0075]). The users’ speech commands are responded to by the wireline node sending back customised entertainment and/or information. One source of information used is a “content engine” in the network, which is essentially a database of what is available.
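The general arrangement described above, in which many user sites send an alert and then speech data to a single central node that deals with each site separately, can be pictured with the following sketch. It is illustrative only. The message format, the site identifiers and the function name `partition_back_channel` are all assumptions made for this example; it is not taken from the specification and is not intended to reflect any particular construction of the claims.

```python
def partition_back_channel(messages):
    """Sort a mixed stream of upstream messages into per-site speech channels.

    messages: iterable of (kind, site_id, payload), where kind is either
    "alert" (talk button pressed, identifies the site in advance) or
    "speech" (a packet of pre-processed speech data from that site).
    Returns a dict mapping each site to its concatenated speech data.
    """
    channels = {}
    for kind, site_id, payload in messages:
        if kind == "alert":
            # The alert arrives first, so resources for the site can be
            # prepared before any speech data needs processing.
            channels.setdefault(site_id, [])
        elif kind == "speech":
            channels.setdefault(site_id, []).append(payload)
    return {site: b"".join(parts) for site, parts in channels.items()}


# Hypothetical interleaved traffic from two user sites.
upstream = [
    ("alert", "site-1", b""),
    ("speech", "site-1", b"\x01\x02"),
    ("alert", "site-2", b""),
    ("speech", "site-2", b"\x0a"),
    ("speech", "site-1", b"\x03"),
]
per_site = partition_back_channel(upstream)
```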
115. Since only proposed amended claim 13 is now defended, it is convenient to work just from that. The parties provided the following claim breakdown. They used different letters to denote the individual claim features, as indicated by the first and last columns. The number denotes from which claim the feature comes, and the lettering is for reference:
Amended Claim 13
Sky No. | Claim feature | Promptu No.
1A | A method of using a back channel containing a multiplicity of identified speech channels from a multiplicity of user sites (1100) | 1a
1B | presented to a speech recognition system (3200) at a wireline node (1300) of a network |
1C | supporting at least one of cable television delivery and video delivery, comprising the steps of: |
1D | receiving said back channel to create a received back channel, | 1b
1E | partitioning said received back channel into a multiplicity of received identified speech channels; | 1c
1F | processing each of said multiplicity of said received identified speech channels with said speech recognition system to create corresponding recognized speech content for each received identified speech channel; | 1d
1G | responding to said recognized speech content to create a recognized speech content response that is unique for each of said received identified speech channels; and | 1e
1H | individually controlling the delivery of entertainment and information services to each user site (1100) in accordance with said recognized speech | 1f
1I | wherein said network comprises a content engine from which said speech recognition system receives content status information; | 1g
11A | wherein each user site contains at least one set top box which is associated with a remote control containing a microphone and a talk button; | 11a
11B | wherein the analogue sound signals picked up by the microphone are pre-processed by the remote control; | 11b
11C | and wherein the set top box receives a radio frequency (RF) signal from the remote control; |
13 | and wherein upon depressing the talk button on the remote control data are sent to the wireline node alerting the system as to the user site and a potential input. | 13a
116. Hereafter, I am going to use the claim feature labels proposed by Sky (the left column) simply because it breaks down into more detail.
117. Feature 1I was proposed to be added by amendment. Features 1A to 1H were in the claim as granted.
118. As I have already said, shortly before trial Promptu conceded the validity over the prior art of all claims down to and including proposed amended claim 11, but it said that it would defend proposed amended claim 13. Thereafter, Sky narrowed its case down to just Houser, and did not pursue its other pleaded prior art. I was not addressed in detail about the dropping of the other prior art but it seems that there was pragmatic recognition by both sides that Houser was the most relevant art once proposed amended claim 13 was the only remaining target.
119. Following Promptu’s narrowing to proposed amended claim 13, correspondence ensued in the course of which Promptu’s solicitors said Promptu would “assert the inventiveness of amended claim 13 alone”, and that “the only validity issues that remain concern amended claim 13”.
120. Promptu’s opening skeleton, paragraph 48, then said:
“Amended claim 13 is dependent on amended claim 11, itself dependent on amended claim 1. Those two prior claims add the features of (i) a content engine which feeds into the speech recognition system and (ii) a remote control with a microphone and a talk button which pre-processes the user’s speech and communicates with the set top box using radio-frequencies. These features are not relied on as inventive in these proceedings.”
121. During the cross-examination of Dr Robinson a dispute emerged about the effect of this concession, when Counsel for Promptu asked Dr Robinson questions about the steps necessary to get from Houser to proposed amended claim 11. Counsel for Sky objected that those were no longer in play as a result of Promptu’s concession. I directed that the evidence should conclude and that the point could be argued afterwards.
122. When the discussion returned to this point, Counsel for Promptu took the position that although the concession precluded his arguing that getting to proposed amended claim 11 from Houser was inventive, it was nonetheless legitimate for Promptu to rely on the sequence of steps involved as part of a Technograph (Technograph Printed Circuits v. Mills & Rockley [1972] RPC 346 HL) argument, albeit that each was uninventive. I must say that I had not anticipated that that line would be taken, and nor, clearly, had Counsel for Sky, who had not cross-examined on those steps individually.
123. After some discussion, Counsel for Promptu took the fair and pragmatic stance that Promptu would not rely on the steps necessary to get from Houser to proposed amended claim 11, but would maintain that the steps necessary to get from there to proposed amended claim 13 had to be shown by Sky to be obvious in the specific context of Houser; that Sky could not treat proposed amended claim 11 itself, as an abstract collection of features, as being part of the prior art.
124. I think that was right in principle, and was fair. One reason it was fair was that any confusion about the scope of the concession was, in the circumstances, the responsibility of Promptu. In practical terms it meant that the logic for making, in the context of Houser, the further step to proposed amended claim 13 had to be consistent with Sky’s concrete case relating to proposed amended claim 11 as it had been developed in the context of Houser through the evidence of Dr Robinson. Sky always knew that that was going to be the case and there can have been no surprise about it.
125. There was some further discussion about this point, right at the end of the oral argument, in Promptu’s reply, in the context of Pozzoli question 3. I felt that Promptu was trying to retreat from its previous position as identified above, because it contended that the Pozzoli differences included all of the features arising on the claims prior to proposed amended claim 13.
126. However, although Promptu was in this way presenting a somewhat moving target in point of principle, at a concrete level I do not think it made any difference, because it was clear that the only specific point that Promptu sought to make was an alleged inconsistency between the threshold feature in Houser and the implementation of the push-to-talk button, also taught in Houser, that Sky relied on. I am able to deal with this, and do so below.
127. I return to the Pozzoli analysis below.
128. Although it is not necessary, given my analysis, to consider in any detail the potential Technograph steps that Promptu gave up by its concession, it is fair to say that what I heard about them in the course of the cross-examination of Dr Robinson did not sound at all impressive. For example, one of them was the choice to progress figure 15 of Houser, but both experts had agreed that that was positively attractive for specific reasons they also agreed on. So at the end of the day I think that Promptu has been able to make the argument that had potential (the threshold/push-to-talk consistency point), albeit that I have rejected it.
129. The issues of claim interpretation in this case are about the “normal” meaning, not about equivalence. The applicable principles are set out in a number of places in the authorities. I find a convenient and recent one is the judgment of Floyd LJ in Saab Seaeye Limited v Atlas Elektronik [2017] EWCA Civ 2175 at [18] and [19]:
“18. There was no dispute about the principles which apply to the construction of patent claims. Both parties relied, as did the judge, on the summary in this court's judgment in Virgin Atlantic v Premium Aircraft [2010] RPC 8 at [5]:
‘(i) The first overarching principle is that contained in Article 69 of the European Patent Convention.
(ii) Article 69 says that the extent of protection is determined by the claims. It goes on to say that the description and drawings shall be used to interpret the claims. In short the claims are to be construed in context.
(iii) It follows that the claims are to be construed purposively - the inventor's purpose being ascertained from the description and drawings.
(iv) It further follows that the claims must not be construed as if they stood alone - the drawings and description only being used to resolve any ambiguity. Purpose is vital to the construction of claims.
(v) When ascertaining the inventor's purpose, it must be remembered that he may have several purposes depending on the level of generality of his invention. Typically, for instance, an inventor may have one, generally more than one, specific embodiment as well as a generalised concept. But there is no presumption that the patentee necessarily intended the widest possible meaning consistent with his purpose be given to the words that he used: purpose and meaning are different.
(vi) Thus purpose is not the be-all and end-all. One is still at the end of the day concerned with the meaning of the language used. Hence the other extreme of the Protocol - a mere guideline - is also ruled out by Article 69 itself. It is the terms of the claims which delineate the patentee's territory.
(vii) It follows that if the patentee has included what is obviously a deliberate limitation in his claims, it must have a meaning. One cannot disregard obviously intentional elements.
(viii) It also follows that where a patentee has used a word or phrase which, acontextually, might have a particular meaning (narrow or wide) it does not necessarily have that meaning in context.
(ix) It further follows that there is no general 'doctrine of equivalents.'
(x) On the other hand purposive construction can lead to the conclusion that a technically trivial or minor difference between an element of a claim and the corresponding element of the alleged infringement nonetheless falls within the meaning of the element when read purposively. This is not because there is a doctrine of equivalents: it is because that is the fair way to read the claim in context.
(xi) Finally purposive construction leads one to eschew the kind of meticulous verbal analysis which lawyers are too often tempted by their training to indulge.’
19. Sub-paragraph (ix) must now be read in the light of the Supreme Court's judgment in Actavis v Lilly [2017] UKSC 48, which explains that, at least when considering the scope of protection, there is now a second question, to be asked after the patent claim has been interpreted, which is designed to take account of equivalents. There was some reference in the written arguments to the impact of that decision on the present case. In the end, however, Mr Mellor disclaimed any reliance on any doctrine of equivalence for the purposes of supporting an expansive scope of claim in the context of invalidity. That issue will therefore have to await a case in which we are called upon to decide it.”
130. Counsel for Sky stressed points (ii) and (iii), and submitted that “if the specification is all about one thing, it would not be expected that the claims would be about something different”. What this was leading up to was a submission that because the specification of the Patent was (almost) entirely about traditional, closed cable TV networks, the skilled addressee would think that the claims would not cover anything else. I reject this. The principle that claims are to be construed in the context of the specification does not mean that they are, or would be presumed to be, limited to the preferred embodiments. Usually, the patentee generalises from the preferred embodiments, and if general language is used then it is not normally legitimate to restrict the claims to the preferred embodiments. See Floyd J, as he then was, in Nokia v. IPCom [2009] EWHC 3482 (Pat) at [41].
131. The issues of claim interpretation go to the infringement allegations. In that context, two issues arise:
i) Is claim 13 as proposed to be amended limited to closed TV cable networks? “Closed TV cable networks” was the phrase used by Sky. Its definition was not entirely rigorous, partly because it spanned multiple claim features, but I understood it to include the requirements that the infrastructure be of the typical coaxial cable kind, that it be owned by the cable company, and that FDMA or TDMA be used for multiplexing. If claim 13 is so limited then there is no infringement because the relevant messages in Sky Q are sent over the public internet. This point involves consideration of a number of claim features but also has to be assessed in the context of the claim as a whole. I will call this “the network issue”.
ii) Does claim 13 cover sending data as part of making an initial connection to the wireline node, or is it limited to sending data after a connection has been set up? If the latter then there is no infringement because the data transmission relied on by Promptu is part of the WebSocket’s being set up.
132. There are two general matters to address before coming to the claim wording.
133. The first general matter was that Counsel for Promptu submitted that there was no technical reason why the patentee would want to restrict the claim to closed TV cable networks. He asked Dr Robinson if there was any such reason, and Dr Robinson accepted there was none. Similarly, I asked Counsel for Sky the question and she essentially accepted there was none. This is not decisive, because the patentee might choose to have a narrow claim for non-technical reasons (none is apparent from the face of the Patent, but the patentee might have had a reason of their own), but it is important, in my view.
134. Another way of putting this is that the alleged invention and its advantages do not reside, even partly, in the nature of the network that connects the set top box and the wireline node. The alleged invention is about partitioning at the wireline node, and about sending an alert that speech is potentially on the way from the user. What is in between is a matter of indifference, as long as it works.
135. The second general matter is the description of the specification and of the preferred embodiments.
136. Promptu accepts that the great majority of the preferred embodiments, and indeed of the description generally, is in the context of closed TV cable networks. This is clearly correct. But Promptu also says that there are some parts of the specification that are not so limited. For example, it refers to the discussion of local loops not necessarily using coaxial cable at [0021] and [0022], the reference to DOCSIS-type modems at [0071], the reference to Ethernet compatibility at [0081], the use of circuit switched telephony in the return path at [0082], and the connection of the network to the internet in Figure 3, bottom right.
137. There is no reason why the skilled addressee would think that these aspects of the teaching were to be excluded from the claims. They would expect that where general wording was used in the claims, then these aspects of the teaching would be included. The DOCSIS modems are a good example. Sky’s and Dr Robinson’s position on them wavered, but eventually settled (at least Dr Robinson) on the position that they were outside the claim because, owing to the slots provided being allocated too dynamically (i.e. changing too often), there was no “back channel”.
138. Similarly, Sky took a contorted position on the connection to the internet in Figure 3. The skilled addressee would naturally take this as a connection to what Promptu called the “open” internet, i.e. the publicly available internet generally. Since that was inconsistent with Sky’s position that the network of claim 1 must be closed, Sky had to seek to limit the teaching of Figure 3. Dr Robinson suggested that it represented merely limited access to parts of the internet, controlled by the cable company running the network. There is no basis for this approach.
139. Each of these was Sky trying to put the cart before the horse, assuming a narrow meaning of the claim and then trying to find artificial reasons to exclude parts of the teaching. The teaching as a whole supports the general impression that the invention is indifferent to the specific type of network or physical cabling.
140. With these general matters in mind, I turn to the claim language. I have already said this issue spans multiple claim features. I will address some of them individually; other aspects require an overall view.
141. First, and because it qualifies the network as a whole, I will consider claim feature 1C - “supporting at least one of cable television delivery and video delivery”. The textual basis for this is in [0006], albeit in slightly different words (“cable television and/or video delivery”). Neither party’s argument was really satisfactory.
142. Counsel for Promptu submitted that the words were a broad description of the content provided and simply denoted the delivery of moving images. He submitted that this meant that “video delivery” was much broader than “cable television” and, indeed, subsumed it. I found this unsatisfactory in that the words “cable television” would be redundant.
143. Counsel for Sky submitted that the feature required that the network had to be able to support cable television and video delivery, hence it must be a cable TV network. That is unsatisfactory because it gives no meaning to “at least one of”. Part of the submission, however, was that “cable television” denoted broadcast-style, live television channels of the kind to which cable users are used to subscribing, and “video” delivery denoted services such as movies on demand. I accept these descriptions of the content.
144. In my view, this feature just means that the network must be able to support either or both of those types of content. But the feature is not about the kinds of physical cables to be used, provided they can support the content. So this feature does not help Sky in seeking to restrict the claim.
145. Next, I consider the requirement of claim feature 1A for a “back channel”. This merits individual consideration because it was contended to be a term of art. Counsel for Sky submitted that Dr Greaves accepted that it was such; the relevant evidence is at T1/74-75. I do not think that Dr Greaves did accept that it was a term of art, and indeed at T1/75 he said that it was an ordinary English term. What he meant in his initial apparent acceptance at T1/74 was that if used in the specific context of a closed TV cable network with coaxial cabling, “back channel” would be understood to refer to the low frequency band. He had said in his written evidence that there is also a “back channel” in satellite networks, where that narrow context did not apply.
146. So this point also does not help Sky, or at least not unless it could establish that other claim features provided such a specific context, in which case it would not need a narrow meaning of “back channel”.
147. In my view, looked at individually or in the context of the claim as a whole, “back channel” means a channel suitable to carry a multiplicity of speech signals from multiple set-top boxes to the wireline node of the claim. So far as it matters, Dr Robinson agreed that “back channel” could be used to describe a return path in this general manner.
148. There was some additional discussion in the evidence about whether a “back channel” could be a duplex channel. I did not find this helpful, or of separate significance. I certainly can see nothing in the specification to exclude information being sent in both directions to establish the back channel in the first place. It was common ground that the presence of a back channel implies the existence of a forward channel, but this does not advance either side’s argument.
149. Next, I consider feature 1E, “partitioning” the received back channel. This is perhaps where Sky’s argument comes closest to seeking to identify a feature in the claims whose words imply a limitation on the physical set up of the network. The argument was that “partitioning” was apt to describe the FDMA used in the coaxial cables of closed TV cable networks.
150. The problem with this argument is that it is clear from the structure and words of the claim that the “partitioning” takes place in the wireline node, when the system separates all the incoming traffic into the speech coming from each user. The wording is perhaps confusing, or at least inelegant, because although the system is reassembling the speech for each user, it is partitioning all that which it receives. But it is clear that what is being referred to is not multiplexing at the user end.
151. Finally, and in the context of the foregoing, I consider the noun “network” in feature 1B. Dr Robinson’s written evidence accepted (1st report, paragraph 232), and I agree, that in general a “network” might refer to any suitable network for delivering the relevant content. He went on to say, consistently with Sky’s case, that because most of the disclosure was about closed TV cable networks, the reader would think the claim was so limited. However, he recognised that other networks received some attention in the specification; I have given some examples above. This really just provides a round-up of the points I have already covered and illustrates that Sky’s approach was to seek to limit general terms by confining the claims to the specific teaching, and to just some of the preferred teaching at that.
152. I conclude that claim 13 as proposed to be amended is not limited to closed TV cable networks. It does not exclude networks which use the internet.
153. Feature 13 requires that “data are sent” to the wireline node alerting the system as to the user site and a potential input. The purpose of this is to reduce latency by telling the headend to get ready to process speech.
154. There is no basis in the claim for the limitation which Sky seeks to impose, namely that there must be a pre-existing connection over which the data has to be sent. Nor was any technical reason advanced to support why that should be so. Further, I agree with Promptu that since any connection in a packet-switched network is first established by sending some data, Sky’s approach does not make sense.
155. This does not mean that I am holding that sending any data in an effort to set up a connection falls within the claim: the data must identify the user site and a potential input, as the claim requires. What I am holding is simply that if the data also results in a connection being established, that is not outside the claim.
156. The Sky Q system was described in Sky’s PPD. Much of the detail is irrelevant to the issues I have to decide. All that really matters is that:
i) The Sky Q system includes several connected systems:
a) A VREX voice platform which is hosted on Amazon Web Services (“AWS”) in the UK.
b) An ASR function provided by Google which turns voice data into text. This may or may not take place in the UK and Sky does not know for any individual instance whether it does or does not.
c) Sky Search: a content engine containing searchable metadata for the content available to the user. This is also hosted on AWS, but in Ireland.
ii) The systems are connected via the internet.
iii) In particular, when a user presses the voice button on their remote control, a WebSocket individual to that user is opened, which involves sending, among other things, the user’s IP address to the VREX.
iv) When, as will frequently be the case, multiple users activate voice command at the same time, the VREX will receive data from all of them and separate it out into each user’s speech using their IP addresses.
v) It is the VREX that sends the response back to the user, once it has received what it needs from the ASR and content engine.
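By way of illustration only, the separation step described at (iv) can be sketched as follows. This is a minimal sketch under stated assumptions and is not a description of Sky's actual implementation: the helper name `partition_by_user`, the `(ip_address, chunk)` frame format and the example addresses are all hypothetical and do not come from the PPD.

```python
from collections import defaultdict


def partition_by_user(frames):
    """Group interleaved (ip_address, chunk) frames into one byte stream per user.

    Hypothetical helper: the frame format is an assumption made for illustration.
    """
    streams = defaultdict(bytearray)
    for ip_address, chunk in frames:
        # Append in arrival order so each user's speech data stays in sequence.
        streams[ip_address] += chunk
    return {ip: bytes(data) for ip, data in streams.items()}


# Interleaved traffic from two users arriving at the same time (illustrative values).
incoming = [
    ("203.0.113.5", b"play "),
    ("198.51.100.7", b"record "),
    ("203.0.113.5", b"news"),
    ("198.51.100.7", b"football"),
]
per_user = partition_by_user(incoming)
```

The point of the sketch is only that the separation happens at the receiving end: everything that arrives is divided up by user identifier, whatever kind of network carried it there.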
157. I have simplified slightly. For example, some functions (called “use cases”) previously alleged to infringe are now accepted not to be within the claims, and sometimes the VREX itself can perform speech recognition. But those points of detail do not matter to the issues at this trial. At most they would affect the scope of any financial relief.
158. Sky accepted that if it was wrong on the two construction issues I have discussed above, then Sky Q fell within the scope of the claims as far as their meaning is concerned (i.e. leaving aside territoriality). Since I have indeed held against Sky on the two points, that conclusion does follow. In essence, this is because (a) the claim is not limited to closed TV cable networks but can extend to networks over the internet and (b) the initial set up of the WebSocket involves sending the user’s IP address and can be understood by the VREX as connoting that speech commands are imminent.
159. As I have identified, not all parts of the Sky Q system are in the UK. This means that some of the steps of proposed amended claim 13, which is a method claim, are performed in the UK and others are not. In the case of the ASR, it sometimes is and sometimes is not, and Sky does not know for any given instance whether it is or not.
160. This kind of situation has been considered in a number of previous cases. I was referred in particular to the judgment of Aldous LJ in Menashe v. William Hill [2002] EWCA Civ 1702, the judgment of Arnold J (as he then was) in RIM v. Motorola [2010] EWHC 188 (Pat), and the judgment of Henry Carr J in Illumina Inc v Premaitha Health Plc [2017] EWHC 2930 (Pat).
161. In my view, the principles derivable from these cases are that (a) the Court’s task is to identify by whom and where, in substance, the method is being used; and (b) it is relevant to take into account that for some steps it simply may not matter where processing power is located.
162. In the present case, all features of the claim except the speech recognition and content engine access take place in the UK. The method is, overall, a method of using a back channel which takes place at the wireline node, i.e. at the “server” end, remote from the user. User input triggers part of the method, i.e. sending data from the user end identifying the user, but it is not the user who puts the method into effect.
163. The central part of the processing at the server end is the partitioning of incoming signals on the back channel and subsequent provision of the unique recognised speech content response and individual delivery of services accordingly. Content engine access and speech recognition are subordinate and in the Sky Q system are essentially sub-contracted; I consider that it is a matter of indifference where those two functions take place, an impression that is fortified by the fact that Sky do not even know where the ASR takes place for any given user interaction.
164. Overall therefore I am of the clear view that the method is performed, in substance, by Sky, in the UK.
165. Sky argued that the method was (a) indeed performed by Sky (as opposed to Sky Q customers), but that (b) it was not performed in the UK because of the ASR (sometimes) and the Sky Search being done abroad. I agree with the former, as I have already said, but disagree with the latter because of the subordinate role of those functions. Sky did not really engage with why those functions being abroad meant that performance of the method as a whole was abroad and in essence its argument seemed to be that there is no infringement in the UK purely because some parts of a method are done abroad. That is clearly not the right principle.
166. I think it is unnecessary and probably not right in principle to compare facts with the earlier cases and there are both similarities and differences in relation to the three I have mentioned above.
167. Counsel for Sky nonetheless submitted that this case was (most) like RIM, where it was found that there was no infringement, and was unlike Menashe, where the method was focused on the user’s end (“a gaming system for playing an interactive casino game”). Although I do not think comparing facts is the right way to proceed, I accept those parallels exist. They would not help Sky anyway, though, because proposed amended claim 13 is focused not on the user’s end but on the server end, and Arnold J held in RIM that, for a claim to “a method of operating a messaging gateway system”, the method was done at the server. He held there was no infringement in the UK because the server was in Canada, and in the present case the VREX server, the important one, is in the UK.
168. Therefore, if the Patent had been valid, Sky would have infringed.
169. I will deal with added matter and then with Houser.
170. There was no dispute about the applicable legal principles, which can be found in a number of places. Both sides referred to Nokia v. IPCom [2012] EWCA Civ 567 and in particular, given the nature of the attacks, which are allegations of intermediate generalisation, to what Kitchin LJ, as he then was, said at [53] to [60]:
“53 Then, in decision T 0331/87, Houdaille/Removal of feature [1991] E.P.O.R. 194, the TBA laid down a three part test at [3]–[6]:
‘3. For the determination whether an amendment of a claim does or does not extend beyond the subject-matter of the application as filed, it is necessary to examine if the overall change in the content of the application originating from this amendment (whether by way of addition, alteration or excision) results in the skilled person being presented with information which is not directly and unambiguously derivable from that previously presented by the application, even when account is taken of matter which is implicit to a person skilled in the art in what has been expressly mentioned (Guidelines, Part C, Chapter VI, No. 5.4). In other words, it is to examine whether the claim as amended is supported by the description as filed.
4. In the decision T 260/85 (“Coaxial connector/AMP, OJ EPO, 1989, 105) the Board of Appeal 3.5.1 came to the conclusion that “it is not permissible to delete from a claim a feature which the application as originally filed consistently presents as being an essential feature of the invention, since this would constitute a violation of Art.123(2) EPC” (cf. Point 12 and Headnote). In that case the application as originally filed contained no express or implied disclosure that a certain feature (“air space”) could be omitted. On the contrary, the reasons for its presence were repeatedly emphasised in the specification. It would not have been possible to recognise the possibility of omitting the feature in question from the application (Point 8). It could be recognised from the facts that the necessity for the feature was associated with a web of statements and explanations in the specification, and that its removal would have required amendments to adjust the disclosure and some of the other features in the case.
5. Nevertheless it is also apparent that in other, perhaps less complicated technical situations, the omission of a feature and thereby the broadening of the scope of the claim may be permissible provided the skilled person could recognise that the problem solving effect could still be obtained without it (e.g. T 151/84 - 3.4.1 of 28 August 1987, unreported). As to the critical question of essentiality in this respect, this is a matter of given feasibility of removal or replacement, as well as the manner of disclosure by the applicant.
6. It is the view of the Board that the replacement or removal of a feature from a claim may not violate Art.123(2) EPC provided the skilled person would directly and unambiguously recognise that (1) the feature was not explained as essential in the disclosure, (2) it is not, as such, indispensable for the function of the invention in the light of the technical problem it serves to solve, and (3) the replacement or removal requires no real modification of other features to compensate for the change (following the decision in Case T 260/85, OJ EPO 1989, 105). The feature in question may be inessential even if it was incidentally but consistently presented in combination with other features of the invention. Any replacement by another feature must, of course, be examined for support in the usual manner (cf. Guidelines, Part C, Chapter VI, No. 5.4) with regard to added matter.’
54. Thus the skilled person must be able to recognise directly and unambiguously that (1) the feature is not explained as essential in the original disclosure, (2) it is not, as such, indispensable for the function of the invention in the light of the technical problem it serves to solve, and (3) the replacement or removal requires no real modification of other features to compensate for the change.
55. This test provides a convenient structured approach to the fundamental question whether, following amendment, the skilled person is presented with information about the invention which is not derivable directly and unambiguously from the original disclosure.
56. Turning to intermediate generalisation, this occurs when a feature is taken from a specific embodiment, stripped of its context and then introduced into the claim in circumstances where it would not be apparent to the skilled person that it has any general applicability to the invention.
57. Particular care must be taken when a claim is restricted to some but not all of the features of a preferred embodiment, as the TBA explained in decision T 0025/03 at point 3.3:
‘According to the established case law of the boards of appeal, if a claim is restricted to a preferred embodiment, it is normally not admissible under Article 123(2) EPC to extract isolated features from a set of features which have originally been disclosed in combination for that embodiment. Such kind of amendment would only be justified in the absence of any clearly recognisable functional or structural relationship among said features (see e.g. T 1067/97, point 2.1.3).’
58. So also, in decision T 0284/94 the TBA explained at points 2.1.3–2.1.5 that a careful examination is necessary to establish whether the incorporation into a claim of isolated technical features, having a literal basis of disclosure but in a specific technical context, results in a combination of technical features which is clearly derivable from the application as filed, and the technical function of which contributes to the solution of a recognisable problem. Moreover, it must be clear beyond doubt that the subject matter of the amended claim provides a complete solution to a technical problem unambiguously recognisable from the application.
59. It follows that it is not permissible to introduce into a claim a feature taken from a specific embodiment unless the skilled person would understand that the other features of the embodiment are not necessary to carry out the claimed invention. Put another way, it must be apparent to the skilled person that the selected feature is generally applicable to the claimed invention absent the other features of that embodiment.
60. Ultimately the key question is once again whether the amendment presents the skilled person with new information about the invention which is not directly and unambiguously apparent from the original disclosure. If it does then the amendment is not permissible.”
171. The EPO now places rather less weight on Houdaille, although it is still a permissible approach. This change of emphasis makes no difference to the present case.
172. I also remind myself that the standard is one of clear and unambiguous disclosure. Something that is obvious from the application or might be inferred from it is not good enough.
173. The relevant comparison is with the application as filed, which in this case is PCT/US01/14760.
174. Two added matter allegations remain (another having dropped away because it was a squeeze on construction against an argument that Promptu did not in the end make).
175. The first was that the “individually controlling” functionality to be found in claim feature 1H was only disclosed in the context of a closed TV cable system. In particular, Sky pointed to page 9 of the application as filed, lines 8 to 23:
“The invention comprises a multi-user control system for audio visual devices that incorporates a speech recognition system that is centrally located in or near a wireline node, and which may include a Cable Television (CATV) Headend. The speech recognition system may also be centrally located in or near a server farm a web-site hosting facility, or a network gateway.
In these embodiments of the invention, spoken commands from a cable subscriber are recognized and then acted upon to control the delivery of entertainment and information services, such as Video On Demand, Pay Per View, Channel control, on-line shopping, and the Internet. This system is unique in that the speech command which originates at the user site, often the home of the subscriber, is sent upstream via the return path (often five to 40 MHz) in the cable system to a central speech recognition and identification engine. The speech recognition and identification engine described herein is capable of processing thousands of speech commands simultaneously and offering a low latency entertainment, information, and shopping experience to the user or subscriber.”
176. The second allegation was that the content engine was only disclosed in the application in a particular context, being an “augmented” node or headend and that, relatedly, the augmented node (or headend) was taught as being of a kind disclosed in a co-pending application. Sky referred in particular to page 59, lines 13-20:
“As used herein, the adjective augmented is used to refer to a node incorporating at least one embodiment of the invention.
Augmented node 1310 may control and support optimized upstream communication as disclosed in the co-pending application serial number 09/679,115, entitled "Increased Bandwidth in Aloha-based Frequency Hopping Transmission Systems" by Calderone and Foster, both inventors of this application and commonly assigned to AgileTV, and incorporated herein by reference.”
177. The teaching of a content engine is then found in, in particular, the embodiments of figures 23 and 26, which involve augmented nodes and headends, respectively.
178. Both these allegations suffer from the same problem, which is that although the features of the claims of the Patent in question are indeed disclosed in context with other features (respectively, a closed TV cable system, and the detail of the augmented node in the co-pending application), there is no disclosure, either in the passages relied on by Sky, or in the Application as a whole, that those other features are necessary, or that the features all must come as a package for some reason.
179. Turning to the “individually controlling” feature first, I consider that it is clear that what is important is the information flow from the user to a speech recognition system in the network, processing, and then provision of individualised content. The reader would clearly understand that although the system being referred to is a cable system of a specific kind, the information flow would work with other network types or physical set-ups, and in particular with other return paths. It may be noted, although a minor point overall, that it is merely said at page 9 line 18 that the return path is “often five to 40MHz”.
180. Furthermore, it is wrong to consider page 9 on its own. There are various other instances in the disclosure where networks other than the traditional closed TV cable network are contemplated. I have addressed them in dealing with the network issue on claim interpretation (I referred to the granted Patent but the same text appears in the Application). The overall teaching is very clearly that an invention is being explained in the context mainly of closed TV cable networks, but not limited to that context.
181. So there is no added matter in claiming “individually controlling” without also limiting the claim to cable television in a network. Another way of looking at it is that Sky implicitly says the added matter brought in by the claim’s terms is a teaching that it was not necessary to have a closed TV cable network to use the individually controlling feature. But there never was a teaching in the application as filed that it was necessary.
182. As to the content engine/augmented point, I think the application as filed is clear that an “augmented” node or headend is one where the invention is performed, by which it clearly means that the partitioning of the received back channel takes place - see e.g. claim 1 of the application. This is just to distinguish augmented nodes where that processing is done from nodes which merely pass on data; likewise an augmented headend just means that that processing is done there.
183. Thus the concept of a node where the key processing takes place is clearly disclosed in the application as filed, and was carried through to claim 1 of it, so there cannot be added matter in that, in itself. I am not sure that Sky really disputed this, hence its further reliance on the description at page 59 of the augmented node being of the particular type described in the co-pending application. But that cannot work as an added matter attack, because page 59 merely says that the augmented node may have those characteristics. It is not added matter thereafter to have a claim which does not require those characteristics.
184. So I reject the added matter attacks.
185. There was no dispute about the basic principles. I was referred to the principles set out in Actavis v. ICOS [2019] UKSC 15 at [52] - [73] by Lord Hodge, and to the structured approach from Pozzoli v. BDMO [2007] FSR 872. Sky relied on Brugger v. Medicaid [1996] RPC 635 at 661 (approved by the Supreme Court in Actavis v. ICOS) to the effect that an obvious route is not made less obvious by the existence of other obvious routes. However, in assessing obviousness the number of possible options available may be a relevant factor (see the statement of Kitchin J as he then was in Generics v. Lundbeck [2007] EWHC 1040 (Pat) at [72], also approved in Actavis v. ICOS).
186. Houser is entitled “Information System Having a Speech Interface”. Its area of interest is more specifically subscription television systems, video on demand, electronic guides and schedules and, in those contexts, speech command.
187. The assignee of Houser is Scientific Atlanta which, it was accepted by Dr Robinson, was a well-known maker of equipment for these kinds of applications. As a result, the skilled addressee would be more inclined (above and beyond the interest which is required as a matter of law) to take suggestions in it seriously.
188. Houser contains a number of worked embodiments, which are presented in a good level of detail. That too would lead the skilled addressee to give credence to them. Their description takes up most of the specification. They have a variety of set-ups and some have microphones in a remote control.
189. Two features mentioned in the context of these embodiments require specific mention because of the part they played in the argument before me.
190. First, at columns 15 and 16, in the contexts of the first and second hardware arrangements of figures 4 and 5, there are references to a “threshold element”.
191. Thus, column 15 lines 59-63 say:
“As a power-saving feature, a threshold element 310 may be provided to sense when the sound level exceeds a certain level and enable interface 304 and other components only when sound which is potentially recognizable speech exists.”
192. And column 16 lines 25-31 say:
“As a power-saving feature, a threshold element (not shown) may be provided to sense when the sound level exceeds a certain level and to enable interface circuit 330 and other components only when sound which is potentially recognizable speech exists. A similar threshold element (not shown) may also be provided in remote control 166, if desired.”
193. Second, an optional feature of a press to speak button is described at column 17 lines 16-22:
“Several optional features may be applied to each of the above-identified arrangements. First, on those remote controls which perform speech-related functions, a press to speak (or <Recognize>) button may be used to exclude spurious noise and/or to extend battery life. Thus, the speech-related circuitry may be powered only when the press to speak button is pressed.”
194. An example remote control is shown in figure 9.
195. The sorts of commands available to the user are set out in a number of places, e.g. columns 18, 27 and 30 and include commands that the user would expect to see take effect very quickly, for an acceptable viewing experience.
196. Right at the end of the specification, just before the claims, a variation is suggested (column 33 lines 49-67):
“Other variations to the invention may also be made. For example, although the speech recognition operation is shown in the above embodiments as taking place at the subscriber terminal unit, this processing could take place elsewhere in the system. One variation is shown in FIG. 15 in which a transmitter 515 transmits data representing sounds or spoken words to a node 517. Sounds or spoken words are received by a subscriber terminal unit 519. The sounds or spoken words are transmitted from subscriber terminal 519 to node 517 which includes speech recognition circuitry which uses the data transmitted from transmitter 515 to generate commands according to the sounds or spoken words. Node 517 transmits the command(s) to controlled device 521 via subscriber terminal unit 519 to control controlled device 521. If this arrangement is implemented in a subscription television system, for example, node 517 may be an off-premises device connected to a plurality of subscriber terminal units which access node 517 on a time-sharing basis.”
197. This refers to figure 15, which is as follows:
198. The point that is made is simply that the speech processing may, as an optional variation to that already described, be done at a shared computer in the network rather than in the subscribers’ equipment.
199. I have identified the skilled addressee and the CGK above.
200. I have referred already to Promptu’s concession that proposed amended claim 11 was invalid, to its implications, and to Promptu’s somewhat shifting position over it. This led to an argument over the proper approach to Pozzoli questions 2 and, especially, 3.
201. Sky’s position was that the difference between the prior art and the inventive concept ought to be regarded simply as the features of proposed amended claim 13 that were not present in proposed amended claim 11, i.e. just the feature that “upon depressing the talk button on the remote control data are sent to the wireline node alerting the system as to the user site and a potential input”.
202. Counsel for Promptu submitted that that was too simple and, indeed, ultimately, that the case was not susceptible to Pozzoli analysis.
203. While I accept that the Pozzoli analysis is not mandatory, it is very useful and widely applied. One optional aspect of the analysis comes in at question 2, because if it is too difficult or contentious to encapsulate the inventive concept and thereby potentially reduce “unnecessary verbiage” then the Court can simply work from the features of the claim. That does not arise here; the problem is not with verbiage but with identifying which claim features should be considered as being part of the gap that must be bridged from the prior art, given Promptu’s concession.
204. It should be noted that the basis for Counsel for Promptu submitting that this was not a case for a Pozzoli analysis was not a deficiency in the way Pozzoli works, but the confusion caused by Promptu’s surrender on all the claims down to and including proposed amended claim 11.
205. I think this is a case where Pozzoli analysis can and should be used. Promptu’s acceptance that it would not rely on the steps necessary to get to proposed amended claim 11, but that it reserved the right to attack the overall consistency of Sky’s case, means that it is possible to do so. In essence the relevant differences are simply those set out in the feature of claim 13 as quoted above.
206. There is a wrinkle to this in the sense that claim 11 already requires a talk button, so it might be argued on behalf of Sky that the discussion on proposed amended claim 13 must assume a talk button and the only question is what to do with it. The problem is that Promptu’s argument, as I understood it, was that the inconsistency which it alleges with the threshold feature would deter the skilled addressee both (a) from having a talk button at all, and (b) if they did have one, from using it for the purpose identified in claim 13. In my view (b) is a legitimate point for Promptu to take, and (a) is not, in view of its concession.
207. However in the event, for reasons given below, I have felt able to conclude that the alleged inconsistency does not exist, and would have neither effect. So although I think it is right in principle to approach Pozzoli question 3 as I have indicated, it would not have helped Promptu if I had thought otherwise and been more accepting of its position.
208. Other claim features such as pre-processing in the remote control (11B) and RF signalling (11C) quite clearly could not be relied on by Promptu. It did not seek to do so.
209. My analysis of this question must be based on my decision on question 3, but the evidence was of course prepared when the scope of the battle was much wider. It is relevant to understand what the overall picture was at that stage, to understand the evidence in context.
210. Sky’s overall case based on Houser was that it was obvious to:
i) Choose the figure 15 variation so that the speech recognition processing would take place at a network node shared by multiple users.
ii) Opt for having a system with a microphone in a remote control.
iii) Choose to have pre-processing in the remote control.
iv) Choose to use RF signalling between the remote control and the set top box.
v) Choose to use the press to speak button.
vi) Send an alert to the network node when the press to speak button was pressed, to cut down on latency.
211. A key point is that the sixth step is not taught in Houser even as an option and must therefore be supported from the CGK. One can therefore understand why Promptu took its stand on proposed claim 13.
212. Key arguments made by Promptu were as follows:
i) That Sky’s attack was a stepwise Technograph approach and therefore illegitimate.
ii) That the skilled addressee would be attracted to the fleshed-out embodiments and not the more sketchily-described figure 15 approach; the skilled addressee would have confidence that the former had really been worked on, by an established company in the field.
iii) That Sky’s attack was unduly conceptual and not rooted in the real level of detail to be found in the teaching of the specific embodiments in Houser.
iv) That the press to speak button was merely an optional feature, disclosed after the threshold feature in the specification.
v) That the threshold feature was an important one, that its inclusion would be seen as inconsistent with the press to speak button, and that the former would be preferred over the latter.
vi) That, in relation to an alert of the specified kind being sent on pressing the speak button, there was no teaching to suggest it, Houser showed no sign of even considering it, it was neither supported by the CGK nor obvious, and there were numerous other ways to deal with latency.
213. Promptu also sought to attack in cross-examination the steps concerning pre-processing and RF signalling in/from the remote control, but as I have already said those faded away entirely following the argument over Promptu’s concession, and were not legitimate in the light of it.
214. Although I have held that the relevant differences for Pozzoli purposes are those represented by the specific features of proposed amended claim 13 and not anything else that was already in claim 11, I think I should deal with the broad Technograph point made by Promptu, to address any general contention that the range of possibilities offered by Houser is a factor.
215. One of Sky’s key arguments in relation to the Technograph point was reliance on the principle, illustrated in Brugger, that it is not necessarily only the most attractive route forward from the prior art that is obvious in law; that there may be multiple obvious avenues, possibly even a large number. This principle must not be allowed to run out of control or to rule entirely out of consideration the sheer number of options that a piece of prior art offers, which may be significant in the right case.
216. In the context of Houser in the present case, it is right to recognise that it is a document which explicitly presents a number of options. There are five main embodiments, and on top of that the authors present features which can be chosen as well, such as the threshold and the press to talk button. The reader is essentially invited to consider combining them, and is told what the individual options will accomplish (which is not very difficult to understand in any event).
217. Promptu’s most fundamental point in relation to the number of options was the one that figure 15 is less well worked; an afterthought. But as I have already said, the point loses its force given the experts’ agreement that having processing in the network would be seen as attractive. In addition, the explanation that accompanies figure 15 does not require the earlier teaching to be ignored or scrapped, merely modified.
218. Overall therefore I did not think the Technograph point taken by Promptu was at all convincing.
219. Promptu’s point that Sky’s attack was too conceptual also lacks force. It was reminiscent of the argument against attacks over common general knowledge alone - that they try to avoid inconvenient detail. But this was not an attack over common general knowledge alone. It was an attack over a specific citation and the detail of the citation was there to see, so if Promptu wanted to say that some of it was inconsistent with that proposed by Sky or otherwise inconvenient detail, it was able to do so. It did come up with the threshold/press to speak inconsistency, and I address that below. But I do not think that general assertions that Sky’s attack was too conceptual had any force where not supported by examples.
220. Therefore although I do not lose sight of the somewhat stepwise nature of Sky’s attack as part of the overall picture, the real meat of the argument does indeed come down to the alleged inconsistency between the threshold feature and press to speak, and whether from his or her CGK the skilled person would without invention arrive at the use of an alert as required by proposed amended claim 13.
221. As I have said, Promptu argued that the reader of Houser would give more credence to the threshold feature, would prefer it to the press to speak feature, and would think they were incompatible.
222. This perspective was not grounded in the evidence of Dr Greaves and came into the case only during the trial, in the cross-examination of Dr Robinson. I think it is a lawyer’s point and I was not impressed by it. My main reasons are as follows:
i) Both features are presented as options.
ii) The fact that the threshold is mentioned first in the specification would be of no real significance to the skilled addressee.
iii) I could not see why the features are inconsistent. Using the press to speak button would prevent activity until the button was pressed. Once the button was pressed, it would be possible for speech to have an effect, but the threshold would still be useful to prevent problems being caused by background noise below the threshold that was present at the time of pressing the button.
223. So I would have rejected this basic incompatibility argument on the merits. It also seems to me, as I have said, that it is inconsistent with Promptu’s concession that claim 11 is obvious over Houser, since claim 11 required a talk button.
224. Promptu had a subtler form of this argument, which was broadly to the effect that the press to talk button was incompatible with the use of a threshold in the narrower context of the alert feature of claim 13, because if an alert was sent as soon as the button was pressed then circuitry in the remote would have to be active, and in that case it would not be possible for the threshold feature to serve its intended purpose of saving battery life.
225. This point too came into the case only late and without support from Dr Greaves. Again, I think it was a lawyer’s point, and it lacks force, first because of the primacy in the skilled addressee’s thinking that again must be given to the threshold (which is merely optional) and secondly because it assumes, which I do not think was shown, that all the circuitry would have to be enabled to process an alert. It seems perfectly possible as a matter of logic that only part of the circuitry would be needed to send an alert, and the rest could be left quiescent unless the threshold was exceeded. This is all rather speculative, I acknowledge, but that is because the point was poorly developed.
226. I have referred above to sending an alert. I bear in mind that this is a shorthand and that the feature requires data “alerting the system as to the user site and a potential input” to be sent “on depressing the talk button”.
227. Sky’s argument was that latency was something the skilled addressee would have to have in mind. I agree with this. It is not just obvious but inevitable that a careful but uninventive designer of a system of the kind taught in Houser would have to assess where latency would occur and what would contribute to it. That is supported by the DAVIC document to which I have referred in connection with my findings on CGK. Two sources of latency would be establishing a connection, and preparing the resources necessary to process speech once it arrived.
228. Sky said that there were only three options as to when to open a connection:
i) At the end of a speech utterance;
ii) At the beginning of a speech utterance;
iii) When the push to talk button is depressed.
229. This was probably an oversimplification, since other options included opening the connection when the threshold was passed (if it was in use), or having a connection that was always on.
230. Be that as it may, I found Dr Robinson’s consistent position that opening a connection as soon as possible would be a natural thing to do to be convincing, although the examples he gave were somewhat general. Dr Greaves accepted that the skilled addressee would definitely want to take round trip time into account.
231. Dr Greaves made other very considerable concessions concerning readying resources at the node. For example, at T1/123 line 21 to 125 line 9:
“Q. And they also, in order to reduce latency, would want to
consider readying resources at the node as soon as possible?
A. It is one decision that they might take, yes.
Q. And it would be a sensible decision to alert the speech
recognition system at the node as soon as possible in order to
ready the resources?
A. Well, not necessarily if it ties up resources that could be
used for something else. If we have a voice-activated system,
for instance, which is one of the options described in Houser,
then there could be all sorts of noise going in the household.
The sound level activated system could be sending a lot
of ----
MR. JUSTICE MEADE: Sorry to interrupt, but I think we are still
on the assumption that this is a push to talk.
A. Okay, so we are on push to talk.
MS. LANE: Yes, we are.
A. Even Houser, if we look at the relevant part, puts two level
sensitive detectors on top of the push to talk. If we find
the right paragraph, he does still consider avoiding waking up
the server even with the push to talk variation. This is
presumably to stop waking up the server with unnecessary
traffic.
Q. I am not saying it is the only sensible way of doing it. I am
just saying one sensible way of doing it would be to alert the
speech recognition system at the node as soon as possible in
order to ready resources?
A. I agree that it is one way. I think you said that we agree it
is a good idea. However, I am saying that I would just like
to qualify that yet again to say it is one possible way of
doing that.
Q. And a good way of alerting the speech recognition system as
soon as possible would be to send an alert when the recognise
button is pressed?
A. It is certainly something we could do. As I have said, I am
not sure that it is necessarily good.
Q. It is a sensible option to consider?
A. Yes. We would leave it on the table, certainly.”
232. Another important passage of cross-examination (albeit in a rather general context) was at T1/112 line 22 to 113 line 18:
“Q. But I do not think it would be a difficult thing for them to
think of, because they know that they have to get the
resources ready on this assumption and so the step that they
are taking is to think, "They will be ready sooner if I ask for them sooner". There is nothing very special about that, is there?
A. If the skilled person is going to make it speaker-adaptive and
so on, then they might then realise -- might consider how much
data needs to be loaded when somebody starts speaking, they
might say, well, it requires an extra megabyte for this
household to be loaded in, or something like that. They would
work out how long does it take to load a megabyte off a disc
at that priority date, and they get some number of
milliseconds, and then they would know how much they could
speed up something that might arrive later by proactively
loading it now, yes.
Q. So that is something they are going to consider?
A. They might consider it, yes.
Q. They probably are going to consider it with that type of
situation like, for example, the speaker-adaptive technique?
A. If they get that far, yes.”
233. In his written evidence, Dr Greaves had raised the issue of bandwidth used by alerts and had said that the skilled addressee would not want to use up bandwidth sending alerts; that it would be better to wait until the whole speech message was ready to be sent. In cross-examination he retreated from this, and said that that would be “very bad”. This exchange followed (T1/145 lines 11-19):
“Q. So a better solution would be either to send an alert at the
start of the speech or when the push-to-talk button is
pressed?
A. It would be better than that very bad design point. Another
design point is to just send some slightly shorter packets and
use the initial ones as the alert for the subsequent ones.
Q. So that would be one option and another option would be to
send the alert when the button was pressed?
A. Yes.
Q. And both of those would be obvious choices?
A. I think it would be a matter of experimentation and to see
what worked well.
Q. The skilled person would try both of those things and both of
them would be sensible options?
A. Yes, so they would experiment and try them I think, yes.”
234. Promptu made the fair point that although it mentions the talk button, Houser does not contain any pointers about this sort of latency issue. However, that is understandable given the very fact that the figure 15 arrangement with speech processing in the node is a subsidiary option, less well fleshed out. So I take account of it, but it is of little weight. It does not mean that Houser had missed the idea, just that it had not had to think about it.
235. Dr Robinson gave evidence that one obvious way to implement Houser with speech processing at the node would be by a DOCSIS modem using TCP/IP. This suggestion was made in his third report, and I have borne in mind the fact that at that stage he was very conscious of the issues over proposed amended claim 13 and, as I have already said, went too far in relation to the DAVIC point that he sought to make (based on the DAVIC Out Of Band return path - this is a different thing from the DAVIC latency budget document to which I have referred above). However, his evidence about DOCSIS was much better founded and Dr Greaves accepted it in large measure. In particular he accepted in terms at T1/146 that an obvious way to implement the node-based version of Houser would be the IP protocol with a DOCSIS modem, and after some discussion he accepted that in that case an option (though not to his own taste) would be to use TCP at the transport layer. He went on to accept that if DOCSIS with TCP/IP were to be used, an obvious time to set up the connection would be at the point when the talk button was pressed. He did say that “a reasonable amount of experimentation” would be needed, but it was not my impression that it would be anything out of the ordinary, or done speculatively or without a good expectation of making the set-up work.
236. Setting up the connection with TCP/IP would fall within the claim on the approach I have taken to infringement.
237. DOCSIS plus TCP/IP was just one aspect of Sky’s case, but I thought it was important because it was very concrete. It was a real, specific way of implementing Houser and provides a strong answer to Promptu’s argument that the attack was too abstract.
238. The evidence of the experts was not all entirely in Sky’s favour on these points. Dr Greaves did not budge on some things, and he gave a number of answers supportive of Promptu’s case in the course of the discussion. For example, on the DOCSIS plus TCP point his agreement was caveated at a number of junctures. But overall the evidence was in my view strongly in Sky’s favour, and early passages of cross-examination where Dr Greaves stood his ground were often followed by later statements where he made concessions once tested.
239. I take into account that the argument over sending an alert and when to do it was also made by Sky in a somewhat stepwise fashion; Technograph again has to be considered. But I am satisfied that what Sky proposed was just systematic work: a latency “budget” would have to be prepared, issues such as preparing resources would inevitably be identified, and the appropriate response would be routinely identified.
240. So for all these reasons I find that proposed amended claim 13 is obvious over Houser.
241. I conclude that:
i) The added matter attacks fail.
ii) Proposed amended claim 13 is obvious over Houser and since no other claim is defended, the Patent is invalid and should be revoked.
iii) Had the Patent been valid, it would have been infringed by the operation of the Sky Q system.
242. I will hear Counsel as to the form of Order if it cannot be agreed. I direct that time for seeking permission to appeal shall not run until after the hearing on the form of Order (or the making of such Order if it is agreed).