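I. What Is a Handle

A handle is an identifier used by the operating system, much like an ID card for a person. Handles are unique on the machine: every window that a running program creates has its own handle value, which identifies it.

Why talk about handles? When we automate something, we do not want the automation to take over the whole computer, where touching the mouse is enough to throw the steps out of order. We would rather have the automation act on one specific window or program only, so that even when it is pushed to the background neither side disturbs the other. That is what handles are for.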

Required packages

# Configure the Tsinghua mirror
pip config set global.index-url https://pypi.tuna.tsinghua.edu.cn/simple
pip config set global.trusted-host pypi.tuna.tsinghua.edu.cn
# Install the dependency
pip install pywin32

Basic usage

# Partial reference: https://huaweicloud.csdn.net/63803058dacf622b8df86819.html?spm=1001.2101.3001.6661.1&utm_medium=distribute.pc_relevant_t0.none-task-blog-2~default~BlogCommendFromBaidu~activity-1-122498299-blog-111083068.pc_relevant_vip_default&depth_1-utm_source=distribute.pc_relevant_t0.none-task-blog-2~default~BlogCommendFromBaidu~activity-1-122498299-blog-111083068.pc_relevant_vip_default&utm_relevant_index=1

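1. Get the handle of the window under the mouse

import time
import win32api
import win32gui

time.sleep(2)  # two seconds to move the mouse over the target window
point = win32api.GetCursorPos()  # current cursor position as (x, y)
hwnd = win32gui.WindowFromPoint(point)  # handle of the window at that position
print(hwnd)  # print the handle

I ran this three times, placing the mouse over a text document, the desktop and IDEA after starting it; each run printed a handle ID.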

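2. Get the class name from a handle

Every time a program is closed and reopened its handle value changes, and hunting for the handle from scratch each time is tedious.

Every window is created with what is called a "class name". Unlike the handle, which is different on every launch, the class name almost never changes once it has been defined. So the better approach is to find a program's class name first, later look up the handle from the class name, and then use the handle for the operations we actually need.

vi main.py

import time
import win32api
import win32gui

# Get a window's class name from its handle
def get_clasname(hwnd):
    clasname = win32gui.GetClassName(hwnd)
    print('Window class name: %s' % (clasname))
    return clasname

time.sleep(2)
point = win32api.GetCursorPos()
hwnd = win32gui.WindowFromPoint(point)
# Look up the window's class name
get_clasname(hwnd)

This gives us the class name of the text document window. From here on we go straight from the class name to the handle.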

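3. Get handles from a class name

I did not find a single call that returns every window of a given class (win32gui.FindWindow returns only the first matching top-level window and does not search child windows). The approach below is therefore to collect all handle IDs on the machine, loop over them, read each handle's class name and compare it with the one we want, keeping every handle that matches. This means that if several windows of the same program are open, we get several handles back.

import time
import win32api
import win32gui

# Collect the handles of all top-level windows on this machine
def get_all_windows():
    all_window_handles = []

    # Enumeration callback: append each handle to the list
    def enum_windows_proc(hwnd, param):
        param.append(hwnd)
        return True

    # Call the window enumeration API
    win32gui.EnumWindows(enum_windows_proc, all_window_handles)
    return all_window_handles  # a list of handle IDs

# Return the handle if its class name equals class_name, otherwise None
def get_title(window_handle, class_name):
    # Look up the class name of the handle
    window_class = win32gui.GetClassName(window_handle)
    # Compare it with the requested class name
    if window_class == class_name:
        return window_handle

# Collect all child windows of a window
def get_child_windows(parent_window_handle):
    child_window_handles = []

    def enum_windows_proc(hwnd, param):
        param.append(hwnd)
        return True

    # win32gui.EnumChildWindows walks every child window of the given handle
    win32gui.EnumChildWindows(parent_window_handle, enum_windows_proc, child_window_handles)
    return child_window_handles

# Find window handles by class name
# (the function and parameter say "title", but the value is compared against class names)
def find_hwnd_by_title(title):
    all_windows = get_all_windows()  # every top-level handle
    matched_windows = []  # handles whose class name matches

    # First look for a match among the top-level windows
    for window_handle in all_windows:
        # get_title checks whether this handle's class name is the one we want
        window_title = get_title(window_handle, title)
        if window_title:
            matched_windows.append(window_title)  # keep the matching handle

    # If nothing matched, search the child windows of every top-level window
    if matched_windows:
        return matched_windows
    else:
        child_window_handles = []
        for parent_window_handle in all_windows:
            # extend the list whether or not the window has children
            child_window_handles.extend(get_child_windows(parent_window_handle))
        for child_window_handle in child_window_handles:
            if get_title(child_window_handle, title):
                matched_windows.append(get_title(child_window_handle, title))
        return matched_windows

if __name__ == '__main__':
    hwnd = find_hwnd_by_title("Edit")
    print(hwnd)

This returns the handles of every open window with that class name, so we could even loop over them with multiple threads to drive several windows concurrently.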
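When a single top-level window is enough, the enumeration can be skipped. The following is a minimal sketch, assuming pywin32's `win32gui.FindWindow(class_name, window_title)`, which returns the first matching top-level window and does not look at child windows. The helper name `find_first_by_class` and its `gui` parameter (there only so the function can be exercised without Windows) are hypothetical, not part of pywin32.

```python
def find_first_by_class(class_name, gui=None):
    """Return the first top-level window handle with this class name, or None."""
    if gui is None:
        import win32gui as gui  # pywin32; only importable on Windows

    try:
        # The second argument is the window title; None means "any title".
        hwnd = gui.FindWindow(class_name, None)
    except Exception:
        # Assumption: some builds may raise instead of returning 0 on no match.
        return None
    return hwnd or None  # treat a 0 handle as "not found"
```

For example, `find_first_by_class("Notepad")` should return the main window of classic Notepad if one is open (assuming its top-level class is `Notepad`). The `Edit` control used above is a child window, so it still needs the enumeration approach.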

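II. Simulating Mouse Buttons

Above we obtained a handle for the text document window. A handle lets us do many things; the most common is simulating mouse and keyboard input. Each operation is small and fiddly, so we define a class to hold them.

Common message types and flags

# Official reference: https://learn.microsoft.com/zh-cn/windows/win32/inputdev/wm-lbuttondown

| Message type   | Meaning              | Flag                     | Meaning of the flag                    |
| WM_MOUSEMOVE   | mouse moved          | MK_LBUTTON or MK_RBUTTON | moves reuse the left/right button flag |
| WM_RBUTTONDOWN | right button pressed | MK_RBUTTON               | right button is down                   |
| WM_RBUTTONUP   | right button released| None                     | no flag needed on release              |
| WM_LBUTTONDOWN | left button pressed  | MK_LBUTTON               | left button is down                    |
| WM_LBUTTONUP   | left button released | None                     | no flag needed on release              |

To press a button, specify the message type first and then the flag for the button state.
To release a mouse button, send the release message on its own; no flag is needed.
When the message type is a move, the message itself names no button, so pass the button flag directly to say which button is held.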
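The three rules above can be sketched as plain data. The numeric values below are the ones defined in the Windows headers (the same names are exposed by `win32con`); the helper names `click_messages` and `drag_messages` are hypothetical and only illustrate which message goes with which flag.

```python
# Message and flag values as defined in the Windows SDK headers.
WM_MOUSEMOVE = 0x0200
WM_LBUTTONDOWN = 0x0201
WM_LBUTTONUP = 0x0202
WM_RBUTTONDOWN = 0x0204
WM_RBUTTONUP = 0x0205
MK_LBUTTON = 0x0001
MK_RBUTTON = 0x0002

# button name -> (press message, release message, "button is held" flag)
BUTTONS = {
    "left": (WM_LBUTTONDOWN, WM_LBUTTONUP, MK_LBUTTON),
    "right": (WM_RBUTTONDOWN, WM_RBUTTONUP, MK_RBUTTON),
}


def click_messages(button):
    """(message, flag) pairs for one click: press with the flag, release without."""
    down, up, flag = BUTTONS[button]
    return [(down, flag), (up, 0)]


def drag_messages(button, n_moves):
    """(message, flag) pairs for a drag: press, n_moves moves with the flag, release."""
    down, up, flag = BUTTONS[button]
    return [(down, flag)] + [(WM_MOUSEMOVE, flag)] * n_moves + [(up, 0)]
```

Each pair corresponds to the second and third arguments of one `PostMessage` call; the release messages carry 0, which is what passing None amounts to in the listings of this article.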

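Syntax

# win32api provides PostMessage, which posts a message to a window through the Windows API. Its arguments are:
# 1. the handle of the window that should receive the message
# 2. the message type
# 3. the parameters of the message
win32api.PostMessage(handle, message_type, message_flag, coordinates(x, y))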

Getting the target coordinates

import time
import win32api

# Get the coordinates
time.sleep(3)
print(win32api.GetCursorPos())

Output

(328, 250)
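`GetCursorPos` reports screen coordinates, while the coordinates carried by mouse messages such as `WM_LBUTTONDOWN` are relative to the upper-left corner of the target window's client area. If the window does not start at the screen origin, the two differ. Below is a minimal sketch of the conversion, assuming pywin32's `win32gui.ScreenToClient(hwnd, (x, y))`; the helper name `cursor_in_client_coords` and the `api`/`gui` parameters (used only so the function can be tried with stand-ins) are hypothetical.

```python
def cursor_in_client_coords(hwnd, api=None, gui=None):
    """Return the cursor position relative to the client area of hwnd."""
    if api is None:
        import win32api as api  # pywin32; only importable on Windows
    if gui is None:
        import win32gui as gui

    screen_point = api.GetCursorPos()  # (x, y) in screen coordinates
    # ScreenToClient subtracts the screen position of the client area's origin.
    return gui.ScreenToClient(hwnd, screen_point)
```

Whether this conversion is needed depends on where the target window sits; for a maximized window the difference may be small.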

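Mouse button example

The class below first takes the handle ID we obtained above. When a mouse button is used, it calls win32api.PostMessage to send the button message to the window that owns the handle.

A flag is only needed when the left or right button is pressed: simulating a left press uses WM_LBUTTONDOWN with MK_LBUTTON, while releasing uses WM_LBUTTONUP with None.

The variable pos is the coordinate of the click. It has to be packed with win32api.MAKELONG before it can be passed.

import time
import win32api
import win32con

# Class that wraps mouse operations
class WinMouse(object):
    # Constructor: store the handle ID passed in
    def __init__(self, handle_num: int):
        self.handle = handle_num

    # Press the left mouse button
    def left_button_down(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_LBUTTONDOWN, win32con.MK_LBUTTON, pos)

    # Release the left mouse button
    def left_button_up(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_LBUTTONUP, None, pos)

if __name__ == '__main__':
    hwnd = find_hwnd_by_title("Edit")  # get handles by class name (function from section I.3)
    bd = WinMouse(hwnd[0])  # instantiate WinMouse with the first handle
    # Pack the plain x, y values into the single value PostMessage expects
    pos = win32api.MAKELONG(328, 250)
    # Press, wait 1 s, release
    bd.left_button_down(pos)
    time.sleep(1)
    bd.left_button_up(pos)

When I run the program, the click is delivered whether the text document is in the foreground or the background, even when another window covers it. (There are too many digits in my recording to read; roughly, I put the mouse at the end of the text and the program left-clicks at the spot where I took the coordinates above.)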
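The `pos` value passed to `PostMessage` is a single integer with x in its low 16 bits and y in its high 16 bits. The sketch below reproduces that packing in plain Python for non-negative coordinates; `make_lparam` and `split_lparam` are hypothetical names used for illustration, and the sketch does not handle negative coordinates.

```python
def make_lparam(x, y):
    """Pack x into the low 16 bits and y into the high 16 bits."""
    return ((y & 0xFFFF) << 16) | (x & 0xFFFF)


def split_lparam(lparam):
    """Inverse of make_lparam: return (x, y)."""
    return lparam & 0xFFFF, (lparam >> 16) & 0xFFFF
```

With the coordinates used above, `make_lparam(328, 250)` gives 16384328, that is 250 * 65536 + 328.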

The remaining button methods

    # Move with the left button held
    def mouse_move(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_MOUSEMOVE, win32con.MK_LBUTTON, pos)

    # Move with the right button held
    def right_button_move(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_MOUSEMOVE, win32con.MK_RBUTTON, pos)

    # Press the right button at the given position
    def right_button_down(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_RBUTTONDOWN, win32con.MK_RBUTTON, pos)

    # Release the right button
    def right_button_up(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_RBUTTONUP, None, pos)

    # Simulate a left double-click
    def left_double_click(self, x_pos: int, y_pos: int, click=2, wait=0.4):
        # click is the number of clicks; wait is the total time, split evenly across the clicks
        wait = wait / click
        point = win32api.MAKELONG(x_pos, y_pos)
        for i in range(click):
            self.left_button_down(point)
            time.sleep(wait)
            self.left_button_up(point)

    # Right double-click
    def right_doubleClick(self, x, y, click=2, wait=0.4):
        wait = wait / click
        pos = win32api.MAKELONG(x, y)
        for i in range(click):
            self.right_button_down(pos)
            time.sleep(wait)
            self.right_button_up(pos)

Click helper functions

A single left-click above still takes several lines, so we wrap it once more.

    # These take x, y directly; wait is the delay before the button is released, the default is usually fine
    # Left click
    def left_click(self, x_pos: int, y_pos: int, wait=0.2):
        point = win32api.MAKELONG(x_pos, y_pos)
        self.left_button_down(point)
        time.sleep(wait)
        self.left_button_up(point)

    # Right click
    def right_click(self, x_pos: int, y_pos: int, wait=0.2):
        point = win32api.MAKELONG(x_pos, y_pos)
        self.right_button_down(point)
        time.sleep(wait)
        self.right_button_up(point)

    # Simulate a left double-click
    def left_double_click(self, x_pos: int, y_pos: int, click=2, wait=0.4):
        # click is the number of clicks; wait is the total time, split evenly across the clicks
        wait = wait / click
        point = win32api.MAKELONG(x_pos, y_pos)
        for i in range(click):
            self.left_button_down(point)
            time.sleep(wait)
            self.left_button_up(point)

    # Right double-click
    def right_doubleClick(self, x, y, click=2, wait=0.4):
        wait = wait / click
        pos = win32api.MAKELONG(x, y)
        for i in range(click):
            self.right_button_down(pos)
            time.sleep(wait)
            self.right_button_up(pos)

Mouse drag

Add the step count

vi main.py

# Compute a point on the way from the start point to the end point
def getPointOnLine(start_x, start_y, end_x, end_y, ratio):
    x = ((end_x - start_x) * ratio) + start_x
    y = ((end_y - start_y) * ratio) + start_y
    return int(round(x)), int(round(y))

class WinMouse(object):
    def __init__(self, handle_num: int, num_of_steps=80):  # num_of_steps=80 is new
        self.handle = handle_num
        self.num_of_steps = num_of_steps  # number of steps in a drag

Add left- and right-button drag methods

    # Press and drag; takes two coordinate pairs
    def left_click_move(self, x1: int, y1: int, x2: int, y2: int, wait=2):
        point1 = win32api.MAKELONG(x1, y1)
        self.left_button_down(point1)  # press the left button at the start point
        # The step count defined in __init__
        steps = self.num_of_steps
        # Build the intermediate points with getPointOnLine. With steps = 80 the loop
        # runs 80 times, and i / steps is the fraction of the way from the start point
        # to the end point (think of walking from the top-left to the bottom-right).
        points = [getPointOnLine(x1, y1, x2, y2, i / steps) for i in range(steps)]
        points.append((x2, y2))
        wait_time = wait / steps
        # Drop duplicate points while keeping their original order
        unique_points = list(set(points))
        unique_points.sort(key=points.index)
        for point in unique_points:
            x, y = point
            point = win32api.MAKELONG(x, y)
            self.mouse_move(point)
            time.sleep(wait_time)
        self.left_button_up(point)

    # Press the right button and drag to select in bulk (same idea as the method above)
    def right_click_move(self, start_x, start_y, end_x, end_y, wait=2):
        pos = win32api.MAKELONG(start_x, start_y)
        self.right_button_down(pos)
        steps = self.num_of_steps
        points = [getPointOnLine(start_x, start_y, end_x, end_y, i / steps) for i in range(steps)]
        points.append((end_x, end_y))
        time_per_step = wait / steps
        distinct_points = list(set(points))
        distinct_points.sort(key=points.index)
        for point in distinct_points:
            x, y = point
            pos = win32api.MAKELONG(x, y)
            self.right_button_move(pos)
            time.sleep(time_per_step)
        self.right_button_up(pos)
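To see what the drag path looks like, the interpolation and de-duplication can be run on their own. This is a sketch under the same formula as `getPointOnLine`; the names `point_on_line` and `drag_path` are hypothetical, and `dict.fromkeys` is swapped in for the `set` plus `sort(key=points.index)` combination because it also keeps first-seen order while avoiding the repeated `index` lookups.

```python
def point_on_line(start, end, ratio):
    """Point at the given fraction of the way from start to end, rounded to pixels."""
    x = (end[0] - start[0]) * ratio + start[0]
    y = (end[1] - start[1]) * ratio + start[1]
    return int(round(x)), int(round(y))


def drag_path(start, end, steps=80):
    """Ordered, duplicate-free list of points visited while dragging."""
    points = [point_on_line(start, end, i / steps) for i in range(steps)]
    points.append(tuple(end))  # make sure the path ends exactly on the target
    # dict keys keep insertion order, so this removes repeats without reordering
    return list(dict.fromkeys(points))
```

For a short drag such as `drag_path((0, 0), (4, 0), steps=4)` the result is the five points from (0, 0) to (4, 0); when the distance is smaller than the step count, several steps round to the same pixel and are collapsed.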

Demo

bd.left_click_move(109,180,232,341)

Full code

import time
import win32api
import win32con
import win32gui

# --------------------------------------------------- handle helpers
# Collect the handles of all top-level windows on this machine
def get_all_windows():
    all_window_handles = []

    # Enumeration callback: append each handle to the list
    def enum_windows_proc(hwnd, param):
        param.append(hwnd)
        return True

    # Call the window enumeration API
    win32gui.EnumWindows(enum_windows_proc, all_window_handles)
    return all_window_handles  # a list of handle IDs

# Return the handle if its class name equals class_name, otherwise None
def get_title(window_handle, class_name):
    # Look up the class name of the handle
    window_class = win32gui.GetClassName(window_handle)
    # Compare it with the requested class name
    if window_class == class_name:
        return window_handle

# Collect all child windows of a window
def get_child_windows(parent_window_handle):
    child_window_handles = []

    def enum_windows_proc(hwnd, param):
        param.append(hwnd)
        return True

    # win32gui.EnumChildWindows walks every child window of the given handle
    win32gui.EnumChildWindows(parent_window_handle, enum_windows_proc, child_window_handles)
    return child_window_handles

# Find window handles by class name
# (the function and parameter say "title", but the value is compared against class names)
def find_hwnd_by_title(title):
    all_windows = get_all_windows()  # every top-level handle
    matched_windows = []  # handles whose class name matches

    # First look for a match among the top-level windows
    for window_handle in all_windows:
        window_title = get_title(window_handle, title)
        if window_title:
            matched_windows.append(window_title)  # keep the matching handle

    # If nothing matched, search the child windows of every top-level window
    if matched_windows:
        return matched_windows
    else:
        child_window_handles = []
        for parent_window_handle in all_windows:
            # extend the list whether or not the window has children
            child_window_handles.extend(get_child_windows(parent_window_handle))
        for child_window_handle in child_window_handles:
            if get_title(child_window_handle, title):
                matched_windows.append(get_title(child_window_handle, title))
        return matched_windows

# --------------------------------------------------- end of handle helpers

# Compute a point on the way from the start point to the end point
def getPointOnLine(start_x, start_y, end_x, end_y, ratio):
    x = ((end_x - start_x) * ratio) + start_x
    y = ((end_y - start_y) * ratio) + start_y
    return int(round(x)), int(round(y))

# Class that wraps mouse operations
class WinMouse(object):
    # Constructor: store the handle ID and the number of drag steps
    def __init__(self, handle_num: int, num_of_steps=80):
        self.handle = handle_num
        self.num_of_steps = num_of_steps

    # Press the left button
    def left_button_down(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_LBUTTONDOWN, win32con.MK_LBUTTON, pos)

    # Release the left button
    def left_button_up(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_LBUTTONUP, None, pos)

    # Move with the left button held
    def mouse_move(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_MOUSEMOVE, win32con.MK_LBUTTON, pos)

    # Move with the right button held
    def right_button_move(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_MOUSEMOVE, win32con.MK_RBUTTON, pos)

    # Press the right button at the given position
    def right_button_down(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_RBUTTONDOWN, win32con.MK_RBUTTON, pos)

    # Release the right button
    def right_button_up(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_RBUTTONUP, None, pos)

    # --------------------------------------------------- click helpers
    # These take x, y directly; wait is the delay before the button is released
    # Left click
    def left_click(self, x_pos: int, y_pos: int, wait=0.2):
        point = win32api.MAKELONG(x_pos, y_pos)
        self.left_button_down(point)
        time.sleep(wait)
        self.left_button_up(point)

    # Right click
    def right_click(self, x_pos: int, y_pos: int, wait=0.2):
        point = win32api.MAKELONG(x_pos, y_pos)
        self.right_button_down(point)
        time.sleep(wait)
        self.right_button_up(point)

    # Simulate a left double-click
    def left_double_click(self, x_pos: int, y_pos: int, click=2, wait=0.4):
        # click is the number of clicks; wait is the total time, split evenly across the clicks
        wait = wait / click
        point = win32api.MAKELONG(x_pos, y_pos)
        for i in range(click):
            self.left_button_down(point)
            time.sleep(wait)
            self.left_button_up(point)

    # Right double-click
    def right_doubleClick(self, x, y, click=2, wait=0.4):
        wait = wait / click
        pos = win32api.MAKELONG(x, y)
        for i in range(click):
            self.right_button_down(pos)
            time.sleep(wait)
            self.right_button_up(pos)

    # Press and drag; takes two coordinate pairs
    def left_click_move(self, x1: int, y1: int, x2: int, y2: int, wait=2):
        point1 = win32api.MAKELONG(x1, y1)
        self.left_button_down(point1)  # press the left button at the start point
        steps = self.num_of_steps  # the step count defined in __init__
        # i / steps is the fraction of the way from the start point to the end point
        points = [getPointOnLine(x1, y1, x2, y2, i / steps) for i in range(steps)]
        points.append((x2, y2))
        wait_time = wait / steps
        # Drop duplicate points while keeping their original order
        unique_points = list(set(points))
        unique_points.sort(key=points.index)
        for point in unique_points:
            x, y = point
            point = win32api.MAKELONG(x, y)
            self.mouse_move(point)
            time.sleep(wait_time)
        self.left_button_up(point)

    # Press the right button and drag to select in bulk (same idea as the method above)
    def right_click_move(self, start_x, start_y, end_x, end_y, wait=2):
        pos = win32api.MAKELONG(start_x, start_y)
        self.right_button_down(pos)
        steps = self.num_of_steps
        points = [getPointOnLine(start_x, start_y, end_x, end_y, i / steps) for i in range(steps)]
        points.append((end_x, end_y))
        time_per_step = wait / steps
        distinct_points = list(set(points))
        distinct_points.sort(key=points.index)
        for point in distinct_points:
            x, y = point
            pos = win32api.MAKELONG(x, y)
            self.right_button_move(pos)
            time.sleep(time_per_step)
        self.right_button_up(pos)

if __name__ == '__main__':
    hwnd = find_hwnd_by_title("Edit")  # get handles by class name
    bd = WinMouse(hwnd[0])  # instantiate WinMouse with the first handle
    bd.left_click_move(109,180,232,341)

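III. Preparing the OpenCV Environment

Above we implemented simple background mouse simulation, but there is a problem: every coordinate is hard-coded. In real use the position to click varies, so when automating we would rather hand the program an image and get back the coordinates of the matching spot.

Install the modules

pip install opencv-python
pip install pillow
pip install opencv-contrib-python
pip install numpy
# The PIL module is provided by the pillow package above; there is no separate "PIL" package to install

On my machine, after installing opencv-python every call under cv2 was flagged with a warning in the editor and there was no code completion. The fix is below.

(venv) PS C:\Users\Administrator\IdeaProjects\test> pip install opencv-python
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Requirement already satisfied: opencv-python in c:\users\administrator\ideaprojects\test\venv\lib\site-packages (4.7.0.72)
Requirement already satisfied: numpy>=1.21.2 in c:\users\administrator\ideaprojects\test\venv\lib\site-packages (from opencv-python) (1.24.3)

Once opencv-python is installed, pip prints the install path. Open that path, go into the cv2 directory, copy the cv2.pyd file into the path below, and restart the editor.

c:\users\administrator\ideaprojects\test\venv\lib\site-packages

Test script

import cv2

# Load an image
img = cv2.imread('11.png', 1)  # put an image of your own next to the script
# Show the image
cv2.imshow('image', img)
cv2.waitKey(0)
cv2.destroyAllWindows()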

1. Add a capture function

import ctypes
import time
import cv2
import numpy as np
import win32api
import win32con
import win32gui
import win32ui
from ctypes import windll  # needed for windll.user32.PrintWindow below
from PIL import Image

def get_bitmap(hwnd):
    """
    Capture a window's bitmap and return it as an image object.

    Args:
        hwnd (int): handle of the window.
    Returns:
        Image: the captured window image.
    """
    # Window position: left/top and right/bottom corners
    left, top, right, bot = win32gui.GetWindowRect(hwnd)
    w = right - left  # window width
    h = bot - top  # window height

    hwnd_dc = win32gui.GetWindowDC(hwnd)  # device context (DC) of the window
    mfc_dc = win32ui.CreateDCFromHandle(hwnd_dc)  # wrap the DC handle in a DC object
    save_dc = mfc_dc.CreateCompatibleDC()  # memory DC compatible with the window DC

    save_bitmap = win32ui.CreateBitmap()  # bitmap object
    save_bitmap.CreateCompatibleBitmap(mfc_dc, w, h)  # same size as the window
    save_dc.SelectObject(save_bitmap)  # select the bitmap into the memory DC

    # Draw the window contents into the bitmap
    windll.user32.PrintWindow(hwnd, save_dc.GetSafeHdc(), 0)

    bmpinfo = save_bitmap.GetInfo()  # bitmap metadata
    bmpstr = save_bitmap.GetBitmapBits(True)  # raw bitmap bytes

    # Build an image object from the bitmap bytes
    bmp = Image.frombuffer('RGB', (bmpinfo['bmWidth'], bmpinfo['bmHeight']), bmpstr, 'raw', 'BGRX', 0, 1)

    # Clean up
    win32gui.DeleteObject(save_bitmap.GetHandle())  # delete the bitmap
    save_dc.DeleteDC()  # delete the memory DC
    mfc_dc.DeleteDC()  # delete the DC object
    win32gui.ReleaseDC(hwnd, hwnd_dc)  # release the window DC

    bmp.save("asdf.png")  # save to a file (for debugging)
    return bmp

# Calling code (find_hwnd_by_title is the function from section I.3)
if __name__ == '__main__':
    hwnd = find_hwnd_by_title("AfxFrameOrView140u")  # get handles by class name
    # bd = WinMouse(hwnd[0])  # instantiate WinMouse with the handle
    # bd.left_click_move(109,180,232,341)
    # Call the capture function and print the returned image
    print(get_bitmap(hwnd[0]))

Here I used the class name of the xshell window for the capture.

On second thought the desktop makes a better demo, so below I change the class name in the code. If you do not remember how to get a class name, see above.

The rest of the script is the same as before: the imports (including from ctypes import windll), the handle helpers, getPointOnLine, the WinMouse class, get_bitmap and get_clasname are exactly as in the listings above. Only the class name in the main block changes:

if __name__ == '__main__':
    hwnd = find_hwnd_by_title("SysListView32")  # get handles by class name
    # bd = WinMouse(hwnd[0])  # instantiate WinMouse with the handle
    # bd.left_click_move(109,180,232,341)
    # Call the capture function and print the returned image
    print(get_bitmap(hwnd[0]))

Capture result: the window image is written to asdf.png.
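The `'raw', 'BGRX'` arguments passed to `Image.frombuffer` tell Pillow how to read the bytes returned by `GetBitmapBits`: four bytes per pixel in blue, green, red order followed by one unused byte. The pure-Python sketch below spells that decoding out. It assumes a 32-bit display bitmap with no row padding, and `bgrx_to_rgb_rows` is a hypothetical name; it illustrates the layout and is far slower than Pillow's decoder.

```python
def bgrx_to_rgb_rows(buf, width, height):
    """Decode a BGRX byte buffer into rows of (r, g, b) tuples."""
    rows = []
    for y in range(height):
        row = []
        for x in range(width):
            i = (y * width + x) * 4  # four bytes per pixel
            b, g, r = buf[i], buf[i + 1], buf[i + 2]  # the fourth byte is padding
            row.append((r, g, b))
        rows.append(row)
    return rows
```

A buffer holding one pure-blue pixel and one pure-green pixel, `bytes([255, 0, 0, 0, 0, 255, 0, 0])`, decodes to `[[(0, 0, 255), (0, 255, 0)]]`.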

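2. Locate a small image inside the full capture

We can now capture a given program's window. Next we need to know where in that window the mouse should click. For example, to click the Google Chrome icon on the desktop, take a screenshot of the icon in advance and then work out where that small image sits in the full capture.

import cv2
import numpy as np

# Read the large image and the small image
large_image = cv2.imread('asdf.png')
small_image = cv2.imread('111.png')

# Match the small image against the large one
result = cv2.matchTemplate(large_image, small_image, cv2.TM_CCOEFF)
min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)

# Position of the small image in the large one (with TM_CCOEFF the best match is the maximum)
top_left = max_loc
bottom_right = (top_left[0] + small_image.shape[1], top_left[1] + small_image.shape[0])

# Mark the position of the small image in the large one
cv2.rectangle(large_image, top_left, bottom_right, (0, 255, 0), 2)

# Show the marked image
cv2.imshow('Result', large_image)
cv2.waitKey(0)
cv2.destroyAllWindows()

The image is located successfully. Next we turn the match into coordinates.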
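For intuition about what template matching computes, here is a tiny pure-Python sketch that slides a template over a grayscale grid and scores every position. Note that it uses the sum of squared differences (the idea behind `cv2.TM_SQDIFF`, where the smallest score wins) rather than the `TM_CCOEFF` method used in the listing above, and that `match_template_sqdiff` is a hypothetical name. It is an illustration of the technique, not OpenCV's implementation.

```python
def match_template_sqdiff(image, template):
    """Return (best_score, (x, y)) for the top-left corner of the best match.

    image and template are lists of rows of numbers (grayscale values).
    """
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_score, best_xy = None, None
    for y in range(ih - th + 1):          # every vertical offset that fits
        for x in range(iw - tw + 1):      # every horizontal offset that fits
            score = sum(
                (image[y + dy][x + dx] - template[dy][dx]) ** 2
                for dy in range(th)
                for dx in range(tw)
            )
            if best_score is None or score < best_score:
                best_score, best_xy = score, (x, y)
    return best_score, best_xy
```

As with `cv2.minMaxLoc`, the returned position is the top-left corner of the matched region, in (x, y) order.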

3. Get the matched coordinates

import ctypes
import cv2
import numpy as np
import win32con
import win32gui
import win32ui
from ctypes import wintypes
from PIL import Image

# Get the position and size of a window
def get_window_rect(hwnd):
    try:
        # Windows API call that reads window attributes (position, size, state)
        f = ctypes.windll.dwmapi.DwmGetWindowAttribute
    except WindowsError:
        f = None
    if f:
        # RECT stores the result as four integers: left, top, right, bottom
        # https://blog.csdn.net/jxlhljh/article/details/129815925
        rect = wintypes.RECT()
        DWMWA_EXTENDED_FRAME_BOUNDS = 9
        # wintypes.HWND(hwnd): convert our handle to the Windows API handle type
        # wintypes.DWORD(DWMWA_EXTENDED_FRAME_BOUNDS): ask for the extended frame bounds
        # ctypes.byref(rect): the structure that receives the result
        # ctypes.sizeof(rect): size of the RECT structure
        f(wintypes.HWND(hwnd),
          wintypes.DWORD(DWMWA_EXTENDED_FRAME_BOUNDS),
          ctypes.byref(rect),
          ctypes.sizeof(rect))
        return [rect.left, rect.top, rect.right, rect.bottom]

# Class for window operations
class windowControl():
    def __init__(self, hwnd):
        self.hwnd = hwnd  # the window handle

    # Target: path of the template image
    # windowPosition: rectangle to capture, as (left, top, right, bottom)
    # zqd: required match accuracy, between 0 and 1
    def window_capture(self, Target, windowPosition, zqd=0.99):
        # Width and height of the capture rectangle
        w_A = windowPosition[2] - windowPosition[0]
        h_A = windowPosition[3] - windowPosition[1]
        # Capture the window with the Windows API: create a bitmap and BitBlt the window into it
        hwndDC = win32gui.GetWindowDC(self.hwnd)
        mfcDC = win32ui.CreateDCFromHandle(hwndDC)
        saveDC = mfcDC.CreateCompatibleDC()
        saveBitMap = win32ui.CreateBitmap()
        saveBitMap.CreateCompatibleBitmap(mfcDC, w_A, h_A)
        saveDC.SelectObject(saveBitMap)
        saveDC.BitBlt((0, 0), (w_A, h_A), mfcDC, (windowPosition[0], windowPosition[1]), win32con.SRCCOPY)
        # Read the bitmap info and bytes
        bmpinfo = saveBitMap.GetInfo()
        bmpstr = saveBitMap.GetBitmapBits(True)
        # Build the image
        im_PIL_TEMP = Image.frombuffer('RGB', (bmpinfo['bmWidth'], bmpinfo['bmHeight']), bmpstr, 'raw', 'BGRX', 0, 1)
        # Convert to the BGR array layout that OpenCV works with
        img = cv2.cvtColor(np.asarray(im_PIL_TEMP), cv2.COLOR_RGB2BGR)
        target = img
        template = cv2.imread(Target)
        theight, twidth = template.shape[:2]  # height and width of the template
        # Run template matching with cv2.TM_SQDIFF_NORMED (0 is a perfect match)
        result = cv2.matchTemplate(target, template, cv2.TM_SQDIFF_NORMED)
        # The original called cv2.normalize(result, result, 0, 1, cv2.NORM_MINMAX, -1) here.
        # That rescales the scores so the minimum is always 0, which makes the threshold
        # test below pass for any image, so the call is left out.
        # Find the smallest and largest scores and where they occur
        min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
        # Release the bitmap and the device contexts
        win32gui.DeleteObject(saveBitMap.GetHandle())
        mfcDC.DeleteDC()
        saveDC.DeleteDC()
        win32gui.ReleaseDC(self.hwnd, hwndDC)
        if abs(min_val) <= (1 - zqd) and min_loc[0] != 0 and min_loc[1] != 0:
            return min_loc[0] + windowPosition[0], min_loc[1] + windowPosition[1]
        else:
            return 0, 0

# find_hwnd_by_title and WinMouse are the ones defined in the earlier sections
if __name__ == '__main__':
    hwnd = find_hwnd_by_title("SysListView32")
    bd = WinMouse(hwnd[0])
    # bd.left_double_click(395,619) #400 450
    # Create the window controller
    w = windowControl(hwnd[0])
    # Rectangle to search: in (0, 0, 1920, 1080), 0 and 0 are the left and top edges,
    # 1920 and 1080 the right and bottom edges.
    # I am working on the desktop, so I use the full size; for a smaller window use that window's size.
    # 0.95 is the match accuracy: anything matching 95% or better is accepted
    result_x, result_y = w.window_capture("111.png", (0, 0, 1920, 1080), 0.95)
    print("Position of the target image in the window:", result_x, result_y)
    # For reasons I have not worked out, the matched coordinates are always slightly off, so adjust by hand for now
    if result_x - 50 < 0:
        x = result_x
    else:
        x = result_x - 50
    if result_y - 50 < 0:
        y = result_y
    else:
        y = result_y - 50
    # Click
    bd.left_double_click(x, y)

In practice there is some coordinate offset. I have not found a good fix yet and am still looking into it.
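The cause of the offset is not established here. One thing that may be worth ruling out is that `cv2.minMaxLoc` reports the top-left corner of the matched region, so a click at that point lands on the corner of the template rather than in the middle of it. Other possible contributors, such as display scaling or the difference between screen and client-area coordinates, are not covered by this sketch. The helper name `match_center` is hypothetical.

```python
def match_center(top_left, template_size):
    """Center of a matched region.

    top_left: (x, y) as returned by cv2.minMaxLoc.
    template_size: (width, height) of the template image.
    """
    x, y = top_left
    width, height = template_size
    return x + width // 2, y + height // 2
```

With OpenCV the template size would come from `template.shape[:2]`, which is (height, width), so the two values need swapping before they are passed in.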

Full code

import ctypes
import time
import cv2
import numpy as np
import win32api
import win32con
import win32gui
import win32ui
from ctypes import wintypes
from PIL import Image

# --------------------------------------------------- handle helpers
# Collect the handles of all top-level windows on this machine
def get_all_windows():
    all_window_handles = []

    # Enumeration callback: append each handle to the list
    def enum_windows_proc(hwnd, param):
        param.append(hwnd)
        return True

    # Call the window enumeration API
    win32gui.EnumWindows(enum_windows_proc, all_window_handles)
    return all_window_handles  # a list of handle IDs

# Return the handle if its class name equals class_name, otherwise None
def get_title(window_handle, class_name):
    # Look up the class name of the handle
    window_class = win32gui.GetClassName(window_handle)
    # Compare it with the requested class name
    if window_class == class_name:
        return window_handle

# Collect all child windows of a window
def get_child_windows(parent_window_handle):
    child_window_handles = []

    def enum_windows_proc(hwnd, param):
        param.append(hwnd)
        return True

    # win32gui.EnumChildWindows walks every child window of the given handle
    win32gui.EnumChildWindows(parent_window_handle, enum_windows_proc, child_window_handles)
    return child_window_handles

# Find window handles by class name
# (the function and parameter say "title", but the value is compared against class names)
def find_hwnd_by_title(title):
    all_windows = get_all_windows()  # every top-level handle
    matched_windows = []  # handles whose class name matches

    # First look for a match among the top-level windows
    for window_handle in all_windows:
        window_title = get_title(window_handle, title)
        if window_title:
            matched_windows.append(window_title)  # keep the matching handle

    # If nothing matched, search the child windows of every top-level window
    if matched_windows:
        return matched_windows
    else:
        child_window_handles = []
        for parent_window_handle in all_windows:
            # extend the list whether or not the window has children
            child_window_handles.extend(get_child_windows(parent_window_handle))
        for child_window_handle in child_window_handles:
            if get_title(child_window_handle, title):
                matched_windows.append(get_title(child_window_handle, title))
        return matched_windows

# --------------------------------------------------- end of handle helpers

# Compute a point on the way from the start point to the end point
def getPointOnLine(start_x, start_y, end_x, end_y, ratio):
    x = ((end_x - start_x) * ratio) + start_x
    y = ((end_y - start_y) * ratio) + start_y
    return int(round(x)), int(round(y))

# Class that wraps mouse operations
class WinMouse(object):
    # Constructor: store the handle ID and the number of drag steps
    def __init__(self, handle_num: int, num_of_steps=80):
        self.handle = handle_num
        self.num_of_steps = num_of_steps

    # Press the left button
    def left_button_down(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_LBUTTONDOWN, win32con.MK_LBUTTON, pos)

    # Release the left button
    def left_button_up(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_LBUTTONUP, None, pos)

    # Move with the left button held
    def mouse_move(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_MOUSEMOVE, win32con.MK_LBUTTON, pos)

    # Move with the right button held
    def right_button_move(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_MOUSEMOVE, win32con.MK_RBUTTON, pos)

    # Press the right button at the given position
    def right_button_down(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_RBUTTONDOWN, win32con.MK_RBUTTON, pos)

    # Release the right button
    def right_button_up(self, pos):
        win32api.PostMessage(self.handle, win32con.WM_RBUTTONUP, None, pos)

    # --------------------------------------------------- click helpers
    # These take x, y directly; wait is the delay before the button is released
    # Left click
    def left_click(self, x_pos: int, y_pos: int, wait=0.2):
        point = win32api.MAKELONG(x_pos, y_pos)
        self.left_button_down(point)
        time.sleep(wait)
        self.left_button_up(point)

    # Right click
    def right_click(self, x_pos: int, y_pos: int, wait=0.2):
        point = win32api.MAKELONG(x_pos, y_pos)
        self.right_button_down(point)
        time.sleep(wait)
        self.right_button_up(point)

    # Simulate a left double-click
    def left_double_click(self, x_pos: int, y_pos: int, click=2, wait=0.4):
        # click is the number of clicks; wait is the total time, split evenly across the clicks
        wait = wait / click
        point = win32api.MAKELONG(x_pos, y_pos)
        for i in range(click):
            self.left_button_down(point)
            time.sleep(wait)
            self.left_button_up(point)

    # Right double-click
    def right_doubleClick(self, x, y, click=2, wait=0.4):
        wait = wait / click
        pos = win32api.MAKELONG(x, y)
        for i in range(click):
            self.right_button_down(pos)
            time.sleep(wait)
            self.right_button_up(pos)

    # Press and drag; takes two coordinate pairs
    def left_click_move(self, x1: int, y1: int, x2: int, y2: int, wait=2):
        point1 = win32api.MAKELONG(x1, y1)
        self.left_button_down(point1)  # press the left button at the start point
        steps = self.num_of_steps  # the step count defined in __init__
        # i / steps is the fraction of the way from the start point to the end point
        points = [getPointOnLine(x1, y1, x2, y2, i / steps) for i in range(steps)]
        points.append((x2, y2))
        wait_time = wait / steps
        # Drop duplicate points while keeping their original order
        unique_points = list(set(points))
        unique_points.sort(key=points.index)
        for point in unique_points:
            x, y = point
            point = win32api.MAKELONG(x, y)
            self.mouse_move(point)
            time.sleep(wait_time)
        self.left_button_up(point)

    # Press the right button and drag to select in bulk (same idea as the method above)
    def right_click_move(self, start_x, start_y, end_x, end_y, wait=2):
        pos = win32api.MAKELONG(start_x, start_y)
        self.right_button_down(pos)
        steps = self.num_of_steps
        points = [getPointOnLine(start_x, start_y, end_x, end_y, i / steps) for i in range(steps)]
        points.append((end_x, end_y))
        time_per_step = wait / steps
        distinct_points = list(set(points))
        distinct_points.sort(key=points.index)
        for point in distinct_points:
            x, y = point
            pos = win32api.MAKELONG(x, y)
            self.right_button_move(pos)
            time.sleep(time_per_step)
        self.right_button_up(pos)

def get_clasname(hwnd):
    clasname = win32gui.GetClassName(hwnd)
    print('Window class name: %s' % (clasname))
    return clasname

# Get the position and size of a window
def get_window_rect(hwnd):
    try:
        # Windows API call that reads window attributes (position, size, state)
        f = ctypes.windll.dwmapi.DwmGetWindowAttribute
    except WindowsError:
        f = None
    if f:
        # RECT stores the result as four integers: left, top, right, bottom
        # https://blog.csdn.net/jxlhljh/article/details/129815925
        rect = wintypes.RECT()
        DWMWA_EXTENDED_FRAME_BOUNDS = 9
        # wintypes.HWND(hwnd): convert our handle to the Windows API handle type
        # wintypes.DWORD(DWMWA_EXTENDED_FRAME_BOUNDS): ask for the extended frame bounds
        # ctypes.byref(rect): the structure that receives the result
        # ctypes.sizeof(rect): size of the RECT structure
        f(wintypes.HWND(hwnd),
          wintypes.DWORD(DWMWA_EXTENDED_FRAME_BOUNDS),
          ctypes.byref(rect),
          ctypes.sizeof(rect))
        return [rect.left, rect.top, rect.right, rect.bottom]

# Class for window operations
class windowControl():
    def __init__(self, hwnd):
        self.hwnd = hwnd  # the window handle

    # Target: path of the template image
    # windowPosition: rectangle to capture, as (left, top, right, bottom)
    # zqd: required match accuracy, between 0 and 1
    def window_capture(self, Target, windowPosition, zqd=0.99):
        # Width and height of the capture rectangle
        w_A = windowPosition[2] - windowPosition[0]
        h_A = windowPosition[3] - windowPosition[1]
        # Capture the window with the Windows API: create a bitmap and BitBlt the window into it
        hwndDC = win32gui.GetWindowDC(self.hwnd)
        mfcDC = win32ui.CreateDCFromHandle(hwndDC)
        saveDC = mfcDC.CreateCompatibleDC()
        saveBitMap = win32ui.CreateBitmap()
        saveBitMap.CreateCompatibleBitmap(mfcDC, w_A, h_A)
        saveDC.SelectObject(saveBitMap)
        saveDC.BitBlt((0, 0), (w_A, h_A), mfcDC, (windowPosition[0], windowPosition[1]), win32con.SRCCOPY)
        # Read the bitmap info and bytes
        bmpinfo = saveBitMap.GetInfo()
        bmpstr = saveBitMap.GetBitmapBits(True)
        # Build the image
        im_PIL_TEMP = Image.frombuffer('RGB', (bmpinfo['bmWidth'], bmpinfo['bmHeight']), bmpstr, 'raw', 'BGRX', 0, 1)
        # Convert to the BGR array layout that OpenCV works with
        img = cv2.cvtColor(np.asarray(im_PIL_TEMP), cv2.COLOR_RGB2BGR)
        target = img
        template = cv2.imread(Target)
        theight, twidth = template.shape[:2]  # height and width of the template
        # Run template matching with cv2.TM_SQDIFF_NORMED (0 is a perfect match)
        result = cv2.matchTemplate(target, template, cv2.TM_SQDIFF_NORMED)
        # The original called cv2.normalize(result, result, 0, 1, cv2.NORM_MINMAX, -1) here.
        # That rescales the scores so the minimum is always 0, which makes the threshold
        # test below pass for any image, so the call is left out.
        # Find the smallest and largest scores and where they occur
        min_val, max_val, min_loc, max_loc = cv2.minMaxLoc(result)
        # Release the bitmap and the device contexts
        win32gui.DeleteObject(saveBitMap.GetHandle())
        mfcDC.DeleteDC()
        saveDC.DeleteDC()
        win32gui.ReleaseDC(self.hwnd, hwndDC)
        if abs(min_val) <= (1 - zqd) and min_loc[0] != 0 and min_loc[1] != 0:
            return min_loc[0] + windowPosition[0], min_loc[1] + windowPosition[1]
        else:
            return 0, 0

if __name__ == '__main__':
    hwnd = find_hwnd_by_title("SysListView32")
    bd = WinMouse(hwnd[0])
    # bd.left_double_click(395,619) #400 450
    # Create the window controller
    w = windowControl(hwnd[0])
    # Rectangle to search: in (0, 0, 1920, 1080), 0 and 0 are the left and top edges,
    # 1920 and 1080 the right and bottom edges.
    # I am working on the desktop, so I use the full size; for a smaller window use that window's size.
    # 0.95 is the match accuracy: anything matching 95% or better is accepted
    result_x, result_y = w.window_capture("111.png", (0, 0, 1920, 1080), 0.95)
    print("Position of the target image in the window:", result_x, result_y)
    # For reasons I have not worked out, the matched coordinates are always slightly off, so adjust by hand for now
    if result_x - 50 < 0:
        x = result_x
    else:
        x = result_x - 50
    if result_y - 50 < 0:
        y = result_y
    else:
        y = result_y - 50
    # Click
    bd.left_double_click(x, y)